Overview


Dataset statistics

Number of variables: 172
Number of observations: 2361473
Missing cells: 246278207
Missing cells (%): 60.6%
Total size in memory: 3.0 GiB
Average record size in memory: 1.3 KiB
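
The percentage and per-record figures above follow from the raw counts. The sketch below recomputes them as a sanity check; the variable names are illustrative, and because the total size is only reported to one decimal place, the record size is an approximation.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 172
n_observations = 2_361_473
missing_cells = 246_278_207
total_size_gib = 3.0  # rounded in the report, so the result is approximate

# Every variable has one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size in KiB: total bytes / rows / 1024.
avg_record_kib = total_size_gib * 1024**3 / n_observations / 1024

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_kib:.1f} KiB")
```

Both derived values agree with the table (60.6% and 1.3 KiB).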

Variable types

Text: 172

Dataset

Description: NMNH Extant Specimen Records (USNM, US) 0049395-241126133413365
URL: https://doi.org/10.15468/dl.42mnjx

Alerts

license has constant value "CC0_1_0" Constant
publisher has constant value "National Museum of Natural History, Smithsonian Institution" Constant
datasetName has constant value "NMNH Extant Biology" Constant
eventType has constant value "Baffin Island" Constant
samplingEffort has constant value "67.0" Constant
fieldNotes has constant value "-63.0" Constant
municipality has constant value "-53.33" Constant
coordinatePrecision has constant value "Leeward Is." Constant
geologicalContextID has constant value "3" Constant
earliestEonOrLowestEonothem has constant value "29" Constant
earliestEraOrLowestErathem has constant value "Plantae" Constant
latestEraOrHighestErathem has constant value "Tracheophyta" Constant
earliestPeriodOrLowestSystem has constant value "Magnoliopsida" Constant
earliestEpochOrLowestSeries has constant value "5410907" Constant
latestAgeOrHighestStage has constant value "North Atlantic Ocean" Constant
lowestBiostratigraphicZone has constant value "Scharf, U." Constant
identificationID has constant value "Baja California Norte" Constant
dateIdentified has constant value "Asterales" Constant
identificationReferences has constant value "Guatteria punctata (Aubl.) R.A.Howard" Constant
scientificNameID has constant value "69.0" Constant
parentNameUsageID has constant value "Campanula" Constant
originalNameUsageID has constant value "Plantae, Dicotyledonae (basal), Magnoliales, Annonaceae, Annonoideae" Constant
nameAccordingToID has constant value "Plantae" Constant
nameAccordingTo has constant value "6" Constant
relativeOrganismQuantity has constant value "3034046" Constant
catalogNumber has 213212 (9.0%) missing values Missing
recordNumber has 1045439 (44.3%) missing values Missing
recordedBy has 498671 (21.1%) missing values Missing
sex has 2009611 (85.1%) missing values Missing
lifeStage has 2107148 (89.2%) missing values Missing
preparations has 1223408 (51.8%) missing values Missing
associatedSequences has 2358372 (99.9%) missing values Missing
occurrenceRemarks has 2047572 (86.7%) missing values Missing
verbatimLabel has 2361471 (> 99.9%) missing values Missing
materialSampleID has 2361471 (> 99.9%) missing values Missing
eventType has 2361472 (> 99.9%) missing values Missing
fieldNumber has 2164715 (91.7%) missing values Missing
eventDate has 419648 (17.8%) missing values Missing
startDayOfYear has 669491 (28.4%) missing values Missing
endDayOfYear has 669490 (28.4%) missing values Missing
year has 423106 (17.9%) missing values Missing
month has 542654 (23.0%) missing values Missing
day has 762160 (32.3%) missing values Missing
verbatimEventDate has 1255739 (53.2%) missing values Missing
habitat has 2177646 (92.2%) missing values Missing
samplingEffort has 2361472 (> 99.9%) missing values Missing
fieldNotes has 2361472 (> 99.9%) missing values Missing
locationID has 2084512 (88.3%) missing values Missing
higherGeography has 73521 (3.1%) missing values Missing
continent has 411637 (17.4%) missing values Missing
waterBody has 1923759 (81.5%) missing values Missing
islandGroup has 2309219 (97.8%) missing values Missing
island has 2204401 (93.3%) missing values Missing
countryCode has 95309 (4.0%) missing values Missing
stateProvince has 637065 (27.0%) missing values Missing
county has 1825433 (77.3%) missing values Missing
municipality has 2361472 (> 99.9%) missing values Missing
locality has 337166 (14.3%) missing values Missing
verbatimElevation has 2293088 (97.1%) missing values Missing
verbatimDepth has 2347005 (99.4%) missing values Missing
decimalLatitude has 1649765 (69.9%) missing values Missing
decimalLongitude has 1649765 (69.9%) missing values Missing
coordinateUncertaintyInMeters has 2318351 (98.2%) missing values Missing
coordinatePrecision has 2361472 (> 99.9%) missing values Missing
pointRadiusSpatialFit has 2361470 (> 99.9%) missing values Missing
verbatimCoordinateSystem has 2103318 (89.1%) missing values Missing
verbatimSRS has 2361467 (> 99.9%) missing values Missing
footprintSRS has 2361470 (> 99.9%) missing values Missing
footprintSpatialFit has 2361469 (> 99.9%) missing values Missing
georeferencedBy has 2361464 (> 99.9%) missing values Missing
georeferencedDate has 2361470 (> 99.9%) missing values Missing
georeferenceProtocol has 2055868 (87.1%) missing values Missing
georeferenceSources has 2361471 (> 99.9%) missing values Missing
georeferenceRemarks has 2309427 (97.8%) missing values Missing
geologicalContextID has 2361472 (> 99.9%) missing values Missing
earliestEonOrLowestEonothem has 2361472 (> 99.9%) missing values Missing
latestEonOrHighestEonothem has 2361470 (> 99.9%) missing values Missing
earliestEraOrLowestErathem has 2361471 (> 99.9%) missing values Missing
latestEraOrHighestErathem has 2361471 (> 99.9%) missing values Missing
earliestPeriodOrLowestSystem has 2361471 (> 99.9%) missing values Missing
latestPeriodOrHighestSystem has 2361471 (> 99.9%) missing values Missing
earliestEpochOrLowestSeries has 2361472 (> 99.9%) missing values Missing
latestEpochOrHighestSeries has 2361465 (> 99.9%) missing values Missing
earliestAgeOrLowestStage has 2361468 (> 99.9%) missing values Missing
latestAgeOrHighestStage has 2361472 (> 99.9%) missing values Missing
lowestBiostratigraphicZone has 2361472 (> 99.9%) missing values Missing
highestBiostratigraphicZone has 2361470 (> 99.9%) missing values Missing
lithostratigraphicTerms has 2361463 (> 99.9%) missing values Missing
group has 2361468 (> 99.9%) missing values Missing
formation has 2361470 (> 99.9%) missing values Missing
member has 2361471 (> 99.9%) missing values Missing
bed has 2361466 (> 99.9%) missing values Missing
identificationID has 2361472 (> 99.9%) missing values Missing
verbatimIdentification has 2361470 (> 99.9%) missing values Missing
identificationQualifier has 2352474 (99.6%) missing values Missing
typeStatus has 2274525 (96.3%) missing values Missing
identifiedBy has 1955406 (82.8%) missing values Missing
identifiedByID has 2361470 (> 99.9%) missing values Missing
dateIdentified has 2361472 (> 99.9%) missing values Missing
identificationReferences has 2361472 (> 99.9%) missing values Missing
identificationVerificationStatus has 2361466 (> 99.9%) missing values Missing
identificationRemarks has 2361467 (> 99.9%) missing values Missing
taxonID has 2361471 (> 99.9%) missing values Missing
scientificNameID has 2361472 (> 99.9%) missing values Missing
parentNameUsageID has 2361472 (> 99.9%) missing values Missing
originalNameUsageID has 2361472 (> 99.9%) missing values Missing
nameAccordingToID has 2361472 (> 99.9%) missing values Missing
namePublishedInID has 2361469 (> 99.9%) missing values Missing
taxonConceptID has 2361471 (> 99.9%) missing values Missing
acceptedNameUsage has 2361470 (> 99.9%) missing values Missing
parentNameUsage has 2361469 (> 99.9%) missing values Missing
originalNameUsage has 2361471 (> 99.9%) missing values Missing
nameAccordingTo has 2361471 (> 99.9%) missing values Missing
namePublishedIn has 2361470 (> 99.9%) missing values Missing
namePublishedInYear has 2361470 (> 99.9%) missing values Missing
class has 138563 (5.9%) missing values Missing
order has 145729 (6.2%) missing values Missing
superfamily has 2361471 (> 99.9%) missing values Missing
family has 52497 (2.2%) missing values Missing
subfamily has 2361471 (> 99.9%) missing values Missing
subtribe has 2361470 (> 99.9%) missing values Missing
genus has 120652 (5.1%) missing values Missing
genericName has 120743 (5.1%) missing values Missing
subgenus has 2361470 (> 99.9%) missing values Missing
infragenericEpithet has 2361471 (> 99.9%) missing values Missing
specificEpithet has 306545 (13.0%) missing values Missing
infraspecificEpithet has 2138642 (90.6%) missing values Missing
cultivarEpithet has 2361470 (> 99.9%) missing values Missing
verbatimTaxonRank has 2361470 (> 99.9%) missing values Missing
vernacularName has 2361469 (> 99.9%) missing values Missing
nomenclaturalCode has 2361468 (> 99.9%) missing values Missing
nomenclaturalStatus has 2361469 (> 99.9%) missing values Missing
taxonRemarks has 2361470 (> 99.9%) missing values Missing
elevation has 1813940 (76.8%) missing values Missing
elevationAccuracy has 2160162 (91.5%) missing values Missing
depth has 2098489 (88.9%) missing values Missing
depthAccuracy has 2120420 (89.8%) missing values Missing
distanceFromCentroidInMeters has 2356831 (99.8%) missing values Missing
mediaType has 863248 (36.6%) missing values Missing
classKey has 138564 (5.9%) missing values Missing
orderKey has 145723 (6.2%) missing values Missing
familyKey has 52492 (2.2%) missing values Missing
genusKey has 120649 (5.1%) missing values Missing
subgenusKey has 2361466 (> 99.9%) missing values Missing
speciesKey has 306496 (13.0%) missing values Missing
species has 306502 (13.0%) missing values Missing
verbatimScientificName has 94306 (4.0%) missing values Missing
typifiedName has 2361471 (> 99.9%) missing values Missing
repatriated has 92313 (3.9%) missing values Missing
relativeOrganismQuantity has 2361472 (> 99.9%) missing values Missing
projectId has 2361467 (> 99.9%) missing values Missing
gbifRegion has 114374 (4.8%) missing values Missing
level0Gid has 1911133 (80.9%) missing values Missing
level0Name has 1911134 (80.9%) missing values Missing
level1Gid has 1912772 (81.0%) missing values Missing
level1Name has 1912766 (81.0%) missing values Missing
level2Gid has 1927752 (81.6%) missing values Missing
level2Name has 1927850 (81.6%) missing values Missing
level3Gid has 2259567 (95.7%) missing values Missing
level3Name has 2260777 (95.7%) missing values Missing
iucnRedListCategory has 383090 (16.2%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
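
The three alert kinds listed above (Constant, Missing, Unique) can be illustrated with a small sketch. This is not ydata-profiling's implementation: the function name `column_alerts`, the use of `None` for a missing cell, and the exact conditions are assumptions chosen to match how the alerts read in this report, where a column such as eventType is flagged both Constant and Missing because it holds a single non-missing value.

```python
def column_alerts(values):
    """Return alert labels for one column; None stands for a missing cell."""
    n = len(values)
    present = [v for v in values if v is not None]
    n_missing = n - len(present)
    n_distinct = len(set(present))

    alerts = []
    # Constant: exactly one distinct non-missing value.
    if n_distinct == 1:
        alerts.append("Constant")
    # Missing: at least one empty cell, reported with its share of rows.
    if n_missing > 0:
        alerts.append(f"Missing: {n_missing} ({100 * n_missing / n:.1f}%)")
    # Unique: no missing cells and every value differs from all others.
    if n_missing == 0 and n > 1 and n_distinct == n:
        alerts.append("Unique")
    return alerts


print(column_alerts(["CC0_1_0", "CC0_1_0", "CC0_1_0"]))
print(column_alerts(["Baffin Island", None, None]))
print(column_alerts(["1321585620", "2452323322", "1321585780"]))
```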

Reproduction

Analysis started: 2025-01-08 22:43:58.844868
Analysis finished: 2025-01-08 22:46:14.905392
Duration: 2 minutes and 16.06 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
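
The reported duration is the difference between the two timestamps. A minimal check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2025-01-08 22:43:58.844868")
finished = datetime.fromisoformat("2025-01-08 22:46:14.905392")

elapsed = finished - started
# Split the elapsed seconds into whole minutes and remaining seconds.
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```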

Variables

gbifID
Text

Unique 

Distinct: 2361473
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 23614730
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2361473
Unique (%): 100.0%

Sample

1st row: 1321585620
2nd row: 2452323322
3rd row: 1321585780
4th row: 1320143695
5th row: 2397792128

Value                   Count    Frequency (%)
1321585620              1        < 0.1%
1320155873              1        < 0.1%
2549497867              1        < 0.1%
1320145763              1        < 0.1%
1321586167              1        < 0.1%
1321585780              1        < 0.1%
1320143695              1        < 0.1%
2397792128              1        < 0.1%
1320143630              1        < 0.1%
1321585990              1        < 0.1%
Other values (2361463)  2361463  > 99.9%

Most occurring characters

ValueCountFrequency (%)
1 4191710
17.8%
3 3357958
14.2%
2 3144111
13.3%
5 1911139
8.1%
8 1885660
8.0%
7 1878917
8.0%
0 1861726
7.9%
4 1811401
7.7%
9 1790742
7.6%
6 1781366
7.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 23614730
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 4191710
17.8%
3 3357958
14.2%
2 3144111
13.3%
5 1911139
8.1%
8 1885660
8.0%
7 1878917
8.0%
0 1861726
7.9%
4 1811401
7.7%
9 1790742
7.6%
6 1781366
7.5%

Most occurring scripts

ValueCountFrequency (%)
Common 23614730
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 4191710
17.8%
3 3357958
14.2%
2 3144111
13.3%
5 1911139
8.1%
8 1885660
8.0%
7 1878917
8.0%
0 1861726
7.9%
4 1811401
7.7%
9 1790742
7.6%
6 1781366
7.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23614730
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 4191710
17.8%
3 3357958
14.2%
2 3144111
13.3%
5 1911139
8.1%
8 1885660
8.0%
7 1878917
8.0%
0 1861726
7.9%
4 1811401
7.7%
9 1790742
7.6%
6 1781366
7.5%

license
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 16530311
Distinct characters: 4
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
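
As an illustration of the sentence above, the sketch below looks up the Unicode general category of each character in the constant license value and scales the per-value counts by the number of rows. It uses Python's standard `unicodedata` module, which returns two-letter category codes (`Lu`, `Nd`, `Pc`) rather than the long names shown in this report; scripts and blocks are not exposed by that module, so they are left out. This is a sketch, not the profiler's own code.

```python
import unicodedata
from collections import Counter

value = "CC0_1_0"    # the constant value of the license variable
n_rows = 2_361_473   # number of observations in this report

# General category of every character in one value:
# Lu = Uppercase Letter, Nd = Decimal Number, Pc = Connector Punctuation.
per_value = Counter(unicodedata.category(ch) for ch in value)

# The column is constant, so column totals are per-value counts times rows.
totals = {category: count * n_rows for category, count in per_value.items()}

for category, count in sorted(totals.items()):
    print(category, count)
```

The totals match the category counts listed for license: 7084419 decimal numbers, 4722946 uppercase letters and 4722946 connector punctuation characters.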

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CC0_1_0
2nd row: CC0_1_0
3rd row: CC0_1_0
4th row: CC0_1_0
5th row: CC0_1_0

Value    Count    Frequency (%)
cc0_1_0  2361473  100.0%

Most occurring characters

ValueCountFrequency (%)
C 4722946
28.6%
0 4722946
28.6%
_ 4722946
28.6%
1 2361473
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7084419
42.9%
Uppercase Letter 4722946
28.6%
Connector Punctuation 4722946
28.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 4722946
66.7%
1 2361473
33.3%
Uppercase Letter
ValueCountFrequency (%)
C 4722946
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4722946
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 11807365
71.4%
Latin 4722946
 
28.6%

Most frequent character per script

Common
ValueCountFrequency (%)
0 4722946
40.0%
_ 4722946
40.0%
1 2361473
20.0%
Latin
ValueCountFrequency (%)
C 4722946
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16530311
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 4722946
28.6%
0 4722946
28.6%
_ 4722946
28.6%
1 2361473
14.3%

Distinct: 231346
Distinct (%): 9.8%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 MiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 47229460
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 103346
Unique (%): 4.4%
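
The Distinct and Unique figures above measure different things: Distinct counts how many different values occur, while Unique counts only the values that occur exactly once. A minimal sketch of the distinction, with a hypothetical helper `distinct_and_unique` and a four-row sample built from timestamps that appear in this section:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of distinct values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


sample = [
    "2017-04-17T11:48:00Z",
    "2017-04-17T11:48:00Z",  # repeated, so distinct but not unique
    "2023-05-10T09:22:00Z",
    "2022-01-03T14:31:00Z",
]
print(distinct_and_unique(sample))
```

For this sample the function reports three distinct values, two of which are unique.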

Sample

1st row: 2023-05-10T09:22:00Z
2nd row: 2022-01-03T14:31:00Z
3rd row: 2022-08-17T11:23:00Z
4th row: 2022-12-30T12:34:00Z
5th row: 2019-07-10T10:37:00Z

Value                  Count    Frequency (%)
2017-04-17t11:48:00z   2463     0.1%
2017-04-17t11:49:00z   2417     0.1%
2024-09-25t13:44:00z   2393     0.1%
2024-09-25t13:46:00z   2237     0.1%
2017-04-17t11:50:00z   2230     0.1%
2024-09-25t17:07:00z   2222     0.1%
2024-09-25t17:02:00z   2213     0.1%
2017-04-17t11:47:00z   2206     0.1%
2024-09-25t13:45:00z   2193     0.1%
2024-09-25t17:05:00z   2193     0.1%
Other values (231336)  2338706  99.0%

Most occurring characters

ValueCountFrequency (%)
0 11717324
24.8%
2 6603970
14.0%
1 5691162
12.1%
- 4722946
10.0%
: 4722946
10.0%
T 2361473
 
5.0%
Z 2361473
 
5.0%
4 1546513
 
3.3%
3 1539433
 
3.3%
5 1442686
 
3.1%
Other values (4) 4519534
 
9.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 33060622
70.0%
Dash Punctuation 4722946
 
10.0%
Other Punctuation 4722946
 
10.0%
Uppercase Letter 4722946
 
10.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 11717324
35.4%
2 6603970
20.0%
1 5691162
17.2%
4 1546513
 
4.7%
3 1539433
 
4.7%
5 1442686
 
4.4%
9 1388747
 
4.2%
8 1153529
 
3.5%
7 1074741
 
3.3%
6 902517
 
2.7%
Uppercase Letter
ValueCountFrequency (%)
T 2361473
50.0%
Z 2361473
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 4722946
100.0%
Other Punctuation
ValueCountFrequency (%)
: 4722946
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 42506514
90.0%
Latin 4722946
 
10.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 11717324
27.6%
2 6603970
15.5%
1 5691162
13.4%
- 4722946
11.1%
: 4722946
11.1%
4 1546513
 
3.6%
3 1539433
 
3.6%
5 1442686
 
3.4%
9 1388747
 
3.3%
8 1153529
 
2.7%
Other values (2) 1977258
 
4.7%
Latin
ValueCountFrequency (%)
T 2361473
50.0%
Z 2361473
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 47229460
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 11717324
24.8%
2 6603970
14.0%
1 5691162
12.1%
- 4722946
10.0%
: 4722946
10.0%
T 2361473
 
5.0%
Z 2361473
 
5.0%
4 1546513
 
3.3%
3 1539433
 
3.3%
5 1442686
 
3.1%
Other values (4) 4519534
 
9.6%

publisher
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 MiB

Length

Max length: 59
Median length: 59
Mean length: 59
Min length: 59

Characters and Unicode

Total characters: 139326907
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: National Museum of Natural History, Smithsonian Institution
2nd row: National Museum of Natural History, Smithsonian Institution
3rd row: National Museum of Natural History, Smithsonian Institution
4th row: National Museum of Natural History, Smithsonian Institution
5th row: National Museum of Natural History, Smithsonian Institution

Value        Count    Frequency (%)
national     2361473  14.3%
museum       2361473  14.3%
of           2361473  14.3%
natural      2361473  14.3%
history      2361473  14.3%
smithsonian  2361473  14.3%
institution  2361473  14.3%
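
The table above counts words rather than whole values: the single publisher string contributes seven lowercased words, each with a 1/7 share (14.3%). The sketch below reproduces that shape under stated assumptions, namely that values are lowercased, split on whitespace, and trimmed of leading and trailing punctuation. These rules are inferred from the tables in this report and are not taken from the profiler's source; `word_counts` is a hypothetical helper.

```python
import string
from collections import Counter


def word_counts(values):
    """Count lowercased, whitespace-separated tokens across all values."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # Trim surrounding punctuation: "history," becomes "history".
            token = token.strip(string.punctuation)
            if token:
                counts[token] += 1
    return counts


publisher = "National Museum of Natural History, Smithsonian Institution"
counts = word_counts([publisher] * 3)  # three rows stand in for 2361473
total = sum(counts.values())
shares = {word: round(100 * n / total, 1) for word, n in counts.items()}
print(shares)
```

Under the same rules a value without internal whitespace, such as `CC0_1_0` or an ISO timestamp, stays a single lowercased token, which is consistent with the other frequency tables in this report.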

Most occurring characters

ValueCountFrequency (%)
t 16530311
11.9%
i 14168838
10.2%
(space) 14168838
10.2%
a 11807365
 
8.5%
o 11807365
 
8.5%
n 11807365
 
8.5%
s 9445892
 
6.8%
u 9445892
 
6.8%
r 4722946
 
3.4%
m 4722946
 
3.4%
Other values (11) 30699149
22.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 108627758
78.0%
Space Separator 14168838
 
10.2%
Uppercase Letter 14168838
 
10.2%
Other Punctuation 2361473
 
1.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 16530311
15.2%
i 14168838
13.0%
a 11807365
10.9%
o 11807365
10.9%
n 11807365
10.9%
s 9445892
8.7%
u 9445892
8.7%
r 4722946
 
4.3%
m 4722946
 
4.3%
l 4722946
 
4.3%
Other values (4) 9445892
8.7%
Uppercase Letter
ValueCountFrequency (%)
N 4722946
33.3%
M 2361473
16.7%
H 2361473
16.7%
S 2361473
16.7%
I 2361473
16.7%
Space Separator
ValueCountFrequency (%)
(space) 14168838
100.0%
Other Punctuation
ValueCountFrequency (%)
, 2361473
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 122796596
88.1%
Common 16530311
 
11.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 16530311
13.5%
i 14168838
11.5%
a 11807365
9.6%
o 11807365
9.6%
n 11807365
9.6%
s 9445892
 
7.7%
u 9445892
 
7.7%
r 4722946
 
3.8%
m 4722946
 
3.8%
N 4722946
 
3.8%
Other values (9) 23614730
19.2%
Common
ValueCountFrequency (%)
(space) 14168838
85.7%
, 2361473
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 139326907
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 16530311
11.9%
i 14168838
10.2%
(space) 14168838
10.2%
a 11807365
 
8.5%
o 11807365
 
8.5%
n 11807365
 
8.5%
s 9445892
 
6.8%
u 9445892
 
6.8%
r 4722946
 
3.4%
m 4722946
 
3.4%
Other values (11) 30699149
22.0%

Distinct: 37
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 MiB

Length

Max length: 29
Median length: 29
Mean length: 28.98757852
Min length: 2

Characters and Unicode

Total characters: 68453384
Distinct characters: 40
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
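
The fractional mean length above is the total character count divided by the number of observations. The sketch below recomputes it and also separates the characters contributed by the two 29-character `urn:lsid:` values from those contributed by the short codes; the row counts for the two long values are copied from the frequency table of this variable, and the variable names are illustrative.

```python
n_rows = 2_361_473             # observations in this report
total_characters = 68_453_384  # "Total characters" for this variable

mean_length = total_characters / n_rows
print(f"Mean length: {mean_length:.8f}")

# Rows holding one of the two 29-character urn:lsid values.
long_rows = 1_210_999 + 1_149_318
short_rows = n_rows - long_rows

# Whatever is not accounted for by the long values comes from short codes.
short_characters = total_characters - 29 * long_rows
print(f"Short-code rows: {short_rows}, characters: {short_characters}")
```

The remaining 4191 characters equal the Uppercase Letter count reported for this variable, which suggests that the short values are all-uppercase codes.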

Unique

Unique: 8
Unique (%): < 0.1%

Sample

1st row: urn:lsid:biocol.org:col:34871
2nd row: urn:lsid:biocol.org:col:15463
3rd row: urn:lsid:biocol.org:col:34871
4th row: urn:lsid:biocol.org:col:34871
5th row: urn:lsid:biocol.org:col:34871

Value                          Count    Frequency (%)
urn:lsid:biocol.org:col:34871  1210999  51.3%
urn:lsid:biocol.org:col:15463  1149318  48.7%
nsmt                           255      < 0.1%
uam                            205      < 0.1%
rmnh                           94       < 0.1%
nrm                            92       < 0.1%
nmv                            65       < 0.1%
rcs                            61       < 0.1%
zmmu                           46       < 0.1%
nmsz                           44       < 0.1%
Other values (27)              294      < 0.1%

Most occurring characters

ValueCountFrequency (%)
o 9441268
13.8%
: 9441268
13.8%
l 7080951
 
10.3%
c 4720634
 
6.9%
i 4720634
 
6.9%
r 4720634
 
6.9%
s 2360317
 
3.4%
d 2360317
 
3.4%
b 2360317
 
3.4%
n 2360317
 
3.4%
Other values (30) 18886727
27.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 44846023
65.5%
Other Punctuation 11801585
 
17.2%
Decimal Number 11801585
 
17.2%
Uppercase Letter 4191
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 1128
26.9%
N 681
16.2%
S 459
11.0%
A 347
 
8.3%
U 306
 
7.3%
T 255
 
6.1%
R 251
 
6.0%
H 151
 
3.6%
C 125
 
3.0%
Z 116
 
2.8%
Other values (10) 372
 
8.9%
Lowercase Letter
ValueCountFrequency (%)
o 9441268
21.1%
l 7080951
15.8%
c 4720634
10.5%
i 4720634
10.5%
r 4720634
10.5%
s 2360317
 
5.3%
d 2360317
 
5.3%
b 2360317
 
5.3%
n 2360317
 
5.3%
g 2360317
 
5.3%
Decimal Number
ValueCountFrequency (%)
3 2360317
20.0%
4 2360317
20.0%
1 2360317
20.0%
8 1210999
10.3%
7 1210999
10.3%
5 1149318
9.7%
6 1149318
9.7%
Other Punctuation
ValueCountFrequency (%)
: 9441268
80.0%
. 2360317
 
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 44850214
65.5%
Common 23603170
34.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 9441268
21.1%
l 7080951
15.8%
c 4720634
10.5%
i 4720634
10.5%
r 4720634
10.5%
s 2360317
 
5.3%
d 2360317
 
5.3%
b 2360317
 
5.3%
n 2360317
 
5.3%
g 2360317
 
5.3%
Other values (21) 2364508
 
5.3%
Common
ValueCountFrequency (%)
: 9441268
40.0%
. 2360317
 
10.0%
3 2360317
 
10.0%
4 2360317
 
10.0%
1 2360317
 
10.0%
8 1210999
 
5.1%
7 1210999
 
5.1%
5 1149318
 
4.9%
6 1149318
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 68453384
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 9441268
13.8%
: 9441268
13.8%
l 7080951
 
10.3%
c 4720634
 
6.9%
i 4720634
 
6.9%
r 4720634
 
6.9%
s 2360317
 
3.4%
d 2360317
 
3.4%
b 2360317
 
3.4%
n 2360317
 
3.4%
Other values (30) 18886727
27.6%

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 106266285
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6
2nd row: urn:uuid:60e28f81-e634-4869-aa3e-732caed713c8
3rd row: urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0
4th row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6
5th row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6

Value                                          Count    Frequency (%)
urn:uuid:60e28f81-e634-4869-aa3e-732caed713c8  1149318  48.7%
urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6  490281   20.8%
urn:uuid:18e3cd08-a962-4f0a-b72c-9a0b3600c5ad  154108   6.5%
urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22  152955   6.5%
urn:uuid:73d83e23-1999-42cd-b38a-c06a7d32d893  149231   6.3%
urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0  148897   6.3%
urn:uuid:09c9cf5f-f5d3-48cc-b5c8-cd9b9fbd631f  116683   4.9%

Most occurring characters

ValueCountFrequency (%)
- 9445892
 
8.9%
8 8119959
 
7.6%
d 7177853
 
6.8%
u 7084419
 
6.7%
3 6484989
 
6.1%
e 5998654
 
5.6%
c 5830009
 
5.5%
1 5730177
 
5.4%
a 5303878
 
5.0%
6 5120127
 
4.8%
Other values (12) 39970328
37.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 47269123
44.5%
Lowercase Letter 44828324
42.2%
Dash Punctuation 9445892
 
8.9%
Other Punctuation 4722946
 
4.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 8119959
17.2%
3 6484989
13.7%
1 5730177
12.1%
6 5120127
10.8%
2 4831298
10.2%
4 4793205
10.1%
7 4331414
9.2%
9 3956559
8.4%
0 2785418
 
5.9%
5 1115977
 
2.4%
Lowercase Letter
ValueCountFrequency (%)
d 7177853
16.0%
u 7084419
15.8%
e 5998654
13.4%
c 5830009
13.0%
a 5303878
11.8%
f 3808433
8.5%
b 2540659
 
5.7%
r 2361473
 
5.3%
i 2361473
 
5.3%
n 2361473
 
5.3%
Dash Punctuation
ValueCountFrequency (%)
- 9445892
100.0%
Other Punctuation
ValueCountFrequency (%)
: 4722946
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 61437961
57.8%
Latin 44828324
42.2%

Most frequent character per script

Common
ValueCountFrequency (%)
- 9445892
15.4%
8 8119959
13.2%
3 6484989
10.6%
1 5730177
9.3%
6 5120127
8.3%
2 4831298
7.9%
4 4793205
7.8%
: 4722946
7.7%
7 4331414
7.1%
9 3956559
6.4%
Other values (2) 3901395
6.4%
Latin
ValueCountFrequency (%)
d 7177853
16.0%
u 7084419
15.8%
e 5998654
13.4%
c 5830009
13.0%
a 5303878
11.8%
f 3808433
8.5%
b 2540659
 
5.7%
r 2361473
 
5.3%
i 2361473
 
5.3%
n 2361473
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 106266285
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 9445892
 
8.9%
8 8119959
 
7.6%
d 7177853
 
6.8%
u 7084419
 
6.7%
3 6484989
 
6.1%
e 5998654
 
5.6%
c 5830009
 
5.5%
1 5730177
 
5.4%
a 5303878
 
5.0%
6 5120127
 
4.8%
Other values (12) 39970328
37.6%

Distinct: 37
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 MiB

Length

Max length: 6
Median length: 4
Mean length: 3.026425879
Min length: 2

Characters and Unicode

Total characters: 7146823
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): < 0.1%

Sample

1st row: USNM
2nd row: US
3rd row: USNM
4th row: USNM
5th row: USNM

Value              Count    Frequency (%)
usnm               1210999  51.3%
us                 1149318  48.7%
nsmt               255      < 0.1%
uam                205      < 0.1%
rmnh               94       < 0.1%
nrm                92       < 0.1%
nmv                65       < 0.1%
rcs                61       < 0.1%
zmmu               46       < 0.1%
nmsz               44       < 0.1%
Other values (27)  294      < 0.1%

Most occurring characters

ValueCountFrequency (%)
S 2360776
33.0%
U 2360623
33.0%
M 1212127
17.0%
N 1211680
17.0%
A 347
 
< 0.1%
T 255
 
< 0.1%
R 251
 
< 0.1%
H 151
 
< 0.1%
C 125
 
< 0.1%
Z 116
 
< 0.1%
Other values (10) 372
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 7146823
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 2360776
33.0%
U 2360623
33.0%
M 1212127
17.0%
N 1211680
17.0%
A 347
 
< 0.1%
T 255
 
< 0.1%
R 251
 
< 0.1%
H 151
 
< 0.1%
C 125
 
< 0.1%
Z 116
 
< 0.1%
Other values (10) 372
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 7146823
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 2360776
33.0%
U 2360623
33.0%
M 1212127
17.0%
N 1211680
17.0%
A 347
 
< 0.1%
T 255
 
< 0.1%
R 251
 
< 0.1%
H 151
 
< 0.1%
C 125
 
< 0.1%
Z 116
 
< 0.1%
Other values (10) 372
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7146823
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 2360776
33.0%
U 2360623
33.0%
M 1212127
17.0%
N 1211680
17.0%
A 347
 
< 0.1%
T 255
 
< 0.1%
R 251
 
< 0.1%
H 151
 
< 0.1%
C 125
 
< 0.1%
Z 116
 
< 0.1%
Other values (10) 372
 
< 0.1%

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 MiB

Length

Max length: 5
Median length: 2
Mean length: 2.609310799
Min length: 2

Characters and Unicode

Total characters: 6161817
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: IZ
2nd row: US
3rd row: HERP
4th row: IZ
5th row: IZ

Value  Count    Frequency (%)
us     1149318  48.7%
iz     490281   20.8%
ent    154108   6.5%
mamm   152955   6.5%
birds  149231   6.3%
herp   148897   6.3%
fish   116683   4.9%

Most occurring characters

ValueCountFrequency (%)
S 1415232
23.0%
U 1149318
18.7%
I 756195
12.3%
Z 490281
 
8.0%
M 458865
 
7.4%
E 303005
 
4.9%
R 298128
 
4.8%
H 265580
 
4.3%
N 154108
 
2.5%
T 154108
 
2.5%
Other values (5) 716997
11.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 6161817
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 1415232
23.0%
U 1149318
18.7%
I 756195
12.3%
Z 490281
 
8.0%
M 458865
 
7.4%
E 303005
 
4.9%
R 298128
 
4.8%
H 265580
 
4.3%
N 154108
 
2.5%
T 154108
 
2.5%
Other values (5) 716997
11.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 6161817
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 1415232
23.0%
U 1149318
18.7%
I 756195
12.3%
Z 490281
 
8.0%
M 458865
 
7.4%
E 303005
 
4.9%
R 298128
 
4.8%
H 265580
 
4.3%
N 154108
 
2.5%
T 154108
 
2.5%
Other values (5) 716997
11.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6161817
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 1415232
23.0%
U 1149318
18.7%
I 756195
12.3%
Z 490281
 
8.0%
M 458865
 
7.4%
E 303005
 
4.9%
R 298128
 
4.8%
H 265580
 
4.3%
N 154108
 
2.5%
T 154108
 
2.5%
Other values (5) 716997
11.6%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 44867987
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NMNH Extant Biology
2nd row: NMNH Extant Biology
3rd row: NMNH Extant Biology
4th row: NMNH Extant Biology
5th row: NMNH Extant Biology

Value    Count    Frequency (%)
nmnh     2361473  33.3%
extant   2361473  33.3%
biology  2361473  33.3%

Most occurring characters

ValueCountFrequency (%)
N 4722946
 
10.5%
(space) 4722946
 
10.5%
t 4722946
 
10.5%
o 4722946
 
10.5%
M 2361473
 
5.3%
H 2361473
 
5.3%
E 2361473
 
5.3%
x 2361473
 
5.3%
a 2361473
 
5.3%
n 2361473
 
5.3%
Other values (5) 11807365
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 25976203
57.9%
Uppercase Letter 14168838
31.6%
Space Separator 4722946
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 4722946
18.2%
o 4722946
18.2%
x 2361473
9.1%
a 2361473
9.1%
n 2361473
9.1%
i 2361473
9.1%
l 2361473
9.1%
g 2361473
9.1%
y 2361473
9.1%
Uppercase Letter
ValueCountFrequency (%)
N 4722946
33.3%
M 2361473
16.7%
H 2361473
16.7%
E 2361473
16.7%
B 2361473
16.7%
Space Separator
ValueCountFrequency (%)
(space) 4722946
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 40145041
89.5%
Common 4722946
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 4722946
11.8%
t 4722946
11.8%
o 4722946
11.8%
M 2361473
 
5.9%
H 2361473
 
5.9%
E 2361473
 
5.9%
x 2361473
 
5.9%
a 2361473
 
5.9%
n 2361473
 
5.9%
B 2361473
 
5.9%
Other values (4) 9445892
23.5%
Common
ValueCountFrequency (%)
(space) 4722946
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 44867987
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 4722946
 
10.5%
(space) 4722946
 
10.5%
t 4722946
 
10.5%
o 4722946
 
10.5%
M 2361473
 
5.3%
H 2361473
 
5.3%
E 2361473
 
5.3%
x 2361473
 
5.3%
a 2361473
 
5.3%
n 2361473
 
5.3%
Other values (5) 11807365
26.3%

basisOfRecord
Text

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size18.0 MiB

Length

Max length19
Median length18
Mean length18.00610509
Min length17

Characters and Unicode

Total characters42520931
Distinct characters17
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPRESERVED_SPECIMEN
2nd rowPRESERVED_SPECIMEN
3rd rowPRESERVED_SPECIMEN
4th rowPRESERVED_SPECIMEN
5th rowPRESERVED_SPECIMEN
ValueCountFrequency (%)
preserved_specimen 2329878
98.7%
machine_observation 23006
 
1.0%
human_observation 8589
 
0.4%

Most occurring characters

ValueCountFrequency (%)
E 11703991
27.5%
R 4691351
11.0%
S 4691351
11.0%
P 4659756
 
11.0%
N 2393068
 
5.6%
I 2384479
 
5.6%
_ 2361473
 
5.6%
M 2361473
 
5.6%
V 2361473
 
5.6%
C 2352884
 
5.5%
Other values (7) 2559632
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 40159458
94.4%
Connector Punctuation 2361473
 
5.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 11703991
29.1%
R 4691351
11.7%
S 4691351
11.7%
P 4659756
 
11.6%
N 2393068
 
6.0%
I 2384479
 
5.9%
M 2361473
 
5.9%
V 2361473
 
5.9%
C 2352884
 
5.9%
D 2329878
 
5.8%
Other values (6) 229754
 
0.6%
Connector Punctuation
ValueCountFrequency (%)
_ 2361473
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 40159458
94.4%
Common 2361473
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 11703991
29.1%
R 4691351
11.7%
S 4691351
11.7%
P 4659756
 
11.6%
N 2393068
 
6.0%
I 2384479
 
5.9%
M 2361473
 
5.9%
V 2361473
 
5.9%
C 2352884
 
5.9%
D 2329878
 
5.8%
Other values (6) 229754
 
0.6%
Common
ValueCountFrequency (%)
_ 2361473
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42520931
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 11703991
27.5%
R 4691351
11.0%
S 4691351
11.0%
P 4659756
 
11.0%
N 2393068
 
5.6%
I 2384479
 
5.6%
_ 2361473
 
5.6%
M 2361473
 
5.6%
V 2361473
 
5.6%
C 2352884
 
5.5%
Other values (7) 2559632
 
6.0%

occurrenceID
Text

Unique 

Distinct2361473
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size18.0 MiB

Length

Max length63
Median length63
Mean length62.99999068
Min length41

Characters and Unicode

Total characters148772777
Distinct characters26
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2361473 ?
Unique (%)100.0%

Sample

1st rowhttp://n2t.net/ark:/65665/3c1d5cd1b-23f9-4aab-8cd8-011e6535be18
2nd rowhttp://n2t.net/ark:/65665/38212d138-cfcd-4363-8d3b-93b82afc1d4b
3rd rowhttp://n2t.net/ark:/65665/3c1d69371-acc7-4c47-bc57-9d5ba7994267
4th rowhttp://n2t.net/ark:/65665/382140f93-30c1-4f26-bd0c-77d197d5ebc0
5th rowhttp://n2t.net/ark:/65665/3c1d814f8-bb57-4c37-a953-dd84b1c6415d
ValueCountFrequency (%)
http://n2t.net/ark:/65665/3c1d5cd1b-23f9-4aab-8cd8-011e6535be18 1
 
< 0.1%
http://n2t.net/ark:/65665/382a04cb2-f704-42e5-bba7-b8a5c5cb730e 1
 
< 0.1%
http://n2t.net/ark:/65665/3821e29b7-fb9d-454b-bb15-3423f912baa1 1
 
< 0.1%
http://n2t.net/ark:/65665/3822c38fd-38fc-4edd-913d-06e17f9f83c5 1
 
< 0.1%
http://n2t.net/ark:/65665/3c1db6db1-1cf4-4831-a73a-6e62a01a92ec 1
 
< 0.1%
http://n2t.net/ark:/65665/3c1d69371-acc7-4c47-bc57-9d5ba7994267 1
 
< 0.1%
http://n2t.net/ark:/65665/382140f93-30c1-4f26-bd0c-77d197d5ebc0 1
 
< 0.1%
http://n2t.net/ark:/65665/3c1d814f8-bb57-4c37-a953-dd84b1c6415d 1
 
< 0.1%
http://n2t.net/ark:/65665/38215186e-af4f-46dc-8b81-ec58617bdfd7 1
 
< 0.1%
http://n2t.net/ark:/65665/3c1d9c4a8-7ba7-48dd-b92e-9924960b16d2 1
 
< 0.1%
Other values (2361463) 2361463
> 99.9%

Most occurring characters

ValueCountFrequency (%)
/ 11807365
 
7.9%
6 11516266
 
7.7%
t 9445892
 
6.3%
- 9445890
 
6.3%
5 9149800
 
6.2%
a 7384461
 
5.0%
2 6789572
 
4.6%
3 6788565
 
4.6%
4 6786988
 
4.6%
e 6782983
 
4.6%
Other values (16) 62874995
42.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 64357740
43.3%
Lowercase Letter 56077363
37.7%
Other Punctuation 18891784
 
12.7%
Dash Punctuation 9445890
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 9445892
16.8%
a 7384461
13.2%
e 6782983
12.1%
b 5018118
8.9%
n 4722946
8.4%
c 4431867
7.9%
d 4426436
7.9%
f 4418768
7.9%
k 2361473
 
4.2%
r 2361473
 
4.2%
Other values (2) 4722946
8.4%
Decimal Number
ValueCountFrequency (%)
6 11516266
17.9%
5 9149800
14.2%
2 6789572
10.5%
3 6788565
10.5%
4 6786988
10.5%
9 5027535
7.8%
8 5024119
7.8%
7 4432331
 
6.9%
1 4425192
 
6.9%
0 4417372
 
6.9%
Other Punctuation
ValueCountFrequency (%)
/ 11807365
62.5%
: 4722946
 
25.0%
. 2361473
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 9445890
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 92695414
62.3%
Latin 56077363
37.7%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 11807365
12.7%
6 11516266
12.4%
- 9445890
10.2%
5 9149800
9.9%
2 6789572
7.3%
3 6788565
7.3%
4 6786988
7.3%
9 5027535
 
5.4%
8 5024119
 
5.4%
: 4722946
 
5.1%
Other values (4) 15636368
16.9%
Latin
ValueCountFrequency (%)
t 9445892
16.8%
a 7384461
13.2%
e 6782983
12.1%
b 5018118
8.9%
n 4722946
8.4%
c 4431867
7.9%
d 4426436
7.9%
f 4418768
7.9%
k 2361473
 
4.2%
r 2361473
 
4.2%
Other values (2) 4722946
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 148772777
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 11807365
 
7.9%
6 11516266
 
7.7%
t 9445892
 
6.3%
- 9445890
 
6.3%
5 9149800
 
6.2%
a 7384461
 
5.0%
2 6789572
 
4.6%
3 6788565
 
4.6%
4 6786988
 
4.6%
e 6782983
 
4.6%
Other values (16) 62874995
42.3%

catalogNumber
Text

Missing 

Distinct1790648
Distinct (%)83.4%
Missing213212
Missing (%)9.0%
Memory size18.0 MiB

Length

Max length22
Median length21
Mean length10.54068477
Min length4

Characters and Unicode

Total characters22644142
Distinct characters67
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
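
The mean length reported above appears to be the total character count divided by the number of non-missing rows. A quick check under that assumption, using only figures copied from this report:

```python
# Figures copied from this report for catalogNumber.
total_rows = 2361473
missing = 213212
total_characters = 22644142

# Assumption: mean length = total characters / non-missing observations.
non_missing = total_rows - missing
mean_length = total_characters / non_missing
print(non_missing, round(mean_length, 5))
```

The result is close to the reported mean length of 10.54068477, which suggests that missing cells are excluded from the length statistics.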

Unique

Unique1533820 ?
Unique (%)71.4%

Sample

1st rowUSNM 1220020
2nd rowUS 2327562
3rd rowUSNM 359728
4th rowUSNM 65866
5th rowUSNM 1569732
ValueCountFrequency (%)
usnm 1056890
25.2%
us 984436
23.5%
herp 1447
 
< 0.1%
tissue 1416
 
< 0.1%
sem 65
 
< 0.1%
48
 
< 0.1%
1 41
 
< 0.1%
stub 40
 
< 0.1%
image 31
 
< 0.1%
micrograph 25
 
< 0.1%
Other values (1602736) 2148327
51.2%

Most occurring characters

ValueCountFrequency (%)
S 2152376
 
9.5%
U 2147427
 
9.5%
(space) 2044505
 
9.0%
1 1805413
 
8.0%
2 1634081
 
7.2%
3 1537039
 
6.8%
0 1310826
 
5.8%
4 1307100
 
5.8%
5 1283378
 
5.7%
N 1249869
 
5.5%
Other values (57) 6172128
27.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 13582232
60.0%
Uppercase Letter 6983762
30.8%
Space Separator 2044505
 
9.0%
Lowercase Letter 25990
 
0.1%
Dash Punctuation 5780
 
< 0.1%
Other Punctuation 1853
 
< 0.1%
Close Punctuation 10
 
< 0.1%
Open Punctuation 10
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 2152376
30.8%
U 2147427
30.7%
N 1249869
17.9%
M 1159926
16.6%
E 112070
 
1.6%
T 100478
 
1.4%
A 17145
 
0.2%
D 16761
 
0.2%
R 11166
 
0.2%
B 8984
 
0.1%
Other values (15) 7560
 
0.1%
Lowercase Letter
ValueCountFrequency (%)
w 11053
42.5%
e 2937
 
11.3%
s 2832
 
10.9%
a 2180
 
8.4%
r 1498
 
5.8%
p 1473
 
5.7%
u 1466
 
5.6%
i 1445
 
5.6%
b 476
 
1.8%
c 186
 
0.7%
Other values (15) 444
 
1.7%
Decimal Number
ValueCountFrequency (%)
1 1805413
13.3%
2 1634081
12.0%
3 1537039
11.3%
0 1310826
9.7%
4 1307100
9.6%
5 1283378
9.4%
6 1213835
8.9%
7 1186666
8.7%
8 1170155
8.6%
9 1133739
8.3%
Other Punctuation
ValueCountFrequency (%)
. 1056
57.0%
* 796
43.0%
? 1
 
0.1%
Space Separator
ValueCountFrequency (%)
(space) 2044505
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5780
100.0%
Close Punctuation
ValueCountFrequency (%)
) 10
100.0%
Open Punctuation
ValueCountFrequency (%)
( 10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 15634390
69.0%
Latin 7009752
31.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 2152376
30.7%
U 2147427
30.6%
N 1249869
17.8%
M 1159926
16.5%
E 112070
 
1.6%
T 100478
 
1.4%
A 17145
 
0.2%
D 16761
 
0.2%
R 11166
 
0.2%
w 11053
 
0.2%
Other values (40) 31481
 
0.4%
Common
ValueCountFrequency (%)
(space) 2044505
13.1%
1 1805413
11.5%
2 1634081
10.5%
3 1537039
9.8%
0 1310826
8.4%
4 1307100
8.4%
5 1283378
8.2%
6 1213835
7.8%
7 1186666
7.6%
8 1170155
7.5%
Other values (7) 1141392
7.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 22644142
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 2152376
 
9.5%
U 2147427
 
9.5%
(space) 2044505
 
9.0%
1 1805413
 
8.0%
2 1634081
 
7.2%
3 1537039
 
6.8%
0 1310826
 
5.8%
4 1307100
 
5.8%
5 1283378
 
5.7%
N 1249869
 
5.5%
Other values (57) 6172128
27.3%

recordNumber
Text

Missing 

Distinct253149
Distinct (%)19.2%
Missing1045439
Missing (%)44.3%
Memory size18.0 MiB

Length

Max length93
Median length90
Mean length4.785216035
Min length1

Characters and Unicode

Total characters6297507
Distinct characters104
Distinct categories12 ?
Distinct scripts2 ?
Distinct blocks4 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique197322 ?
Unique (%)15.0%

Sample

1st row5209
2nd rowUSNPC # 008843
3rd rowUSNPC # 074963
4th row478
5th rows.n.
ValueCountFrequency (%)
s.n 164138
 
11.2%
26102
 
1.8%
usnpc 22710
 
1.5%
no 12214
 
0.8%
number 11997
 
0.8%
bureau 5232
 
0.4%
eyd 4047
 
0.3%
s 3600
 
0.2%
of 3507
 
0.2%
n 3489
 
0.2%
Other values (191948) 1214297
82.5%

Most occurring characters

ValueCountFrequency (%)
1 716221
11.4%
2 556077
 
8.8%
3 479210
 
7.6%
0 459758
 
7.3%
4 448405
 
7.1%
5 430657
 
6.8%
6 416990
 
6.6%
7 393873
 
6.3%
8 377941
 
6.0%
9 367752
 
5.8%
Other values (94) 1650623
26.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4646884
73.8%
Lowercase Letter 543523
 
8.6%
Uppercase Letter 454883
 
7.2%
Other Punctuation 398023
 
6.3%
Space Separator 155299
 
2.5%
Dash Punctuation 89908
 
1.4%
Connector Punctuation 3813
 
0.1%
Close Punctuation 2297
 
< 0.1%
Open Punctuation 2296
 
< 0.1%
Other Number 408
 
< 0.1%
Other values (2) 173
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 175707
32.3%
s 172005
31.6%
e 30020
 
5.5%
u 24493
 
4.5%
r 24320
 
4.5%
o 21858
 
4.0%
a 19196
 
3.5%
b 18006
 
3.3%
m 13239
 
2.4%
c 10258
 
1.9%
Other values (23) 34421
 
6.3%
Uppercase Letter
ValueCountFrequency (%)
N 58959
13.0%
S 46028
 
10.1%
C 37900
 
8.3%
P 36208
 
8.0%
U 27138
 
6.0%
B 24770
 
5.4%
A 23606
 
5.2%
H 20043
 
4.4%
D 18973
 
4.2%
L 18356
 
4.0%
Other values (18) 142902
31.4%
Other Punctuation
ValueCountFrequency (%)
. 347774
87.4%
# 22903
 
5.8%
/ 12068
 
3.0%
& 6380
 
1.6%
* 3583
 
0.9%
? 2718
 
0.7%
, 1563
 
0.4%
! 632
 
0.2%
: 243
 
0.1%
; 104
 
< 0.1%
Other values (3) 55
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 716221
15.4%
2 556077
12.0%
3 479210
10.3%
0 459758
9.9%
4 448405
9.6%
5 430657
9.3%
6 416990
9.0%
7 393873
8.5%
8 377941
8.1%
9 367752
7.9%
Other Number
ValueCountFrequency (%)
½ 393
96.3%
² 6
 
1.5%
¼ 4
 
1.0%
³ 2
 
0.5%
¾ 2
 
0.5%
1
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 2122
92.4%
] 114
 
5.0%
} 61
 
2.7%
Open Punctuation
ValueCountFrequency (%)
( 2121
92.4%
[ 114
 
5.0%
{ 61
 
2.7%
Math Symbol
ValueCountFrequency (%)
= 130
75.6%
+ 40
 
23.3%
~ 2
 
1.2%
Dash Punctuation
ValueCountFrequency (%)
- 89907
> 99.9%
1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 155299
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3813
100.0%
Other Symbol
ValueCountFrequency (%)
° 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5299101
84.1%
Latin 998406
 
15.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 175707
17.6%
s 172005
17.2%
N 58959
 
5.9%
S 46028
 
4.6%
C 37900
 
3.8%
P 36208
 
3.6%
e 30020
 
3.0%
U 27138
 
2.7%
B 24770
 
2.5%
u 24493
 
2.5%
Other values (51) 365178
36.6%
Common
ValueCountFrequency (%)
1 716221
13.5%
2 556077
10.5%
3 479210
9.0%
0 459758
8.7%
4 448405
8.5%
5 430657
8.1%
6 416990
7.9%
7 393873
7.4%
8 377941
7.1%
9 367752
6.9%
Other values (33) 652217
12.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6297080
> 99.9%
None 425
 
< 0.1%
Punctuation 1
 
< 0.1%
Number Forms 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 716221
11.4%
2 556077
 
8.8%
3 479210
 
7.6%
0 459758
 
7.3%
4 448405
 
7.1%
5 430657
 
6.8%
6 416990
 
6.6%
7 393873
 
6.3%
8 377941
 
6.0%
9 367752
 
5.8%
Other values (77) 1650196
26.2%
None
ValueCountFrequency (%)
½ 393
92.5%
² 6
 
1.4%
è 5
 
1.2%
¼ 4
 
0.9%
é 3
 
0.7%
á 3
 
0.7%
³ 2
 
0.5%
¾ 2
 
0.5%
Ʃ 1
 
0.2%
ó 1
 
0.2%
Other values (5) 5
 
1.2%
Punctuation
ValueCountFrequency (%)
1
100.0%
Number Forms
ValueCountFrequency (%)
1
100.0%

recordedBy
Text

Missing 

Distinct115852
Distinct (%)6.2%
Missing498671
Missing (%)21.1%
Memory size18.0 MiB

Length

Max length239
Median length171
Mean length17.19955529
Min length1

Characters and Unicode

Total characters32039366
Distinct characters143
Distinct categories13 ?
Distinct scripts4 ?
Distinct blocks4 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique56989 ?
Unique (%)3.1%

Sample

1st rowG. Hendler
2nd rowR. C. Rollins & D. Rollins
3rd rowT. Vaughan
4th rowD. Harper
5th rowF. Harvey
ValueCountFrequency (%)
413362
 
6.3%
j 303894
 
4.6%
a 242831
 
3.7%
r 228710
 
3.5%
e 216487
 
3.3%
c 207629
 
3.2%
m 197349
 
3.0%
h 179143
 
2.7%
w 156602
 
2.4%
l 143933
 
2.2%
Other values (44536) 4249869
65.0%

Most occurring characters

ValueCountFrequency (%)
(space) 4677007
 
14.6%
. 2978690
 
9.3%
e 2217324
 
6.9%
a 1603560
 
5.0%
r 1548210
 
4.8%
n 1462460
 
4.6%
o 1456032
 
4.5%
i 1334932
 
4.2%
l 1157612
 
3.6%
t 1153395
 
3.6%
Other values (133) 12450144
38.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 17261749
53.9%
Uppercase Letter 6291902
 
19.6%
Space Separator 4677007
 
14.6%
Other Punctuation 3656578
 
11.4%
Dash Punctuation 103346
 
0.3%
Close Punctuation 21592
 
0.1%
Open Punctuation 21572
 
0.1%
Decimal Number 5575
 
< 0.1%
Math Symbol 30
 
< 0.1%
Modifier Symbol 9
 
< 0.1%
Other values (3) 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2217324
12.8%
a 1603560
9.3%
r 1548210
9.0%
n 1462460
 
8.5%
o 1456032
 
8.4%
i 1334932
 
7.7%
l 1157612
 
6.7%
t 1153395
 
6.7%
s 1030523
 
6.0%
h 517272
 
3.0%
Other values (59) 3780429
21.9%
Uppercase Letter
ValueCountFrequency (%)
M 576187
 
9.2%
S 551297
 
8.8%
C 480161
 
7.6%
R 395841
 
6.3%
H 394114
 
6.3%
B 381832
 
6.1%
J 365707
 
5.8%
A 357987
 
5.7%
L 337942
 
5.4%
W 302701
 
4.8%
Other values (31) 2148133
34.1%
Other Punctuation
ValueCountFrequency (%)
. 2978690
81.5%
& 368061
 
10.1%
, 243814
 
6.7%
/ 61528
 
1.7%
' 4009
 
0.1%
" 424
 
< 0.1%
? 27
 
< 0.1%
: 17
 
< 0.1%
; 5
 
< 0.1%
# 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 1357
24.3%
9 1116
20.0%
8 933
16.7%
0 693
12.4%
3 373
 
6.7%
4 368
 
6.6%
2 338
 
6.1%
5 306
 
5.5%
7 54
 
1.0%
6 37
 
0.7%
Open Punctuation
ValueCountFrequency (%)
[ 16664
77.2%
( 4908
 
22.8%
Close Punctuation
ValueCountFrequency (%)
] 16662
77.2%
) 4930
 
22.8%
Math Symbol
ValueCountFrequency (%)
= 26
86.7%
+ 4
 
13.3%
Space Separator
ValueCountFrequency (%)
(space) 4677007
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 103346
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 9
100.0%
Other Symbol
ValueCountFrequency (%)
° 4
100.0%
Final Punctuation
ValueCountFrequency (%)
» 1
100.0%
Initial Punctuation
ValueCountFrequency (%)
« 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23553649
73.5%
Common 8485715
 
26.5%
Cyrillic 1
 
< 0.1%
Greek 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2217324
 
9.4%
a 1603560
 
6.8%
r 1548210
 
6.6%
n 1462460
 
6.2%
o 1456032
 
6.2%
i 1334932
 
5.7%
l 1157612
 
4.9%
t 1153395
 
4.9%
s 1030523
 
4.4%
M 576187
 
2.4%
Other values (98) 10013414
42.5%
Common
ValueCountFrequency (%)
(space) 4677007
55.1%
. 2978690
35.1%
& 368061
 
4.3%
, 243814
 
2.9%
- 103346
 
1.2%
/ 61528
 
0.7%
[ 16664
 
0.2%
] 16662
 
0.2%
) 4930
 
0.1%
( 4908
 
0.1%
Other values (23) 10105
 
0.1%
Cyrillic
ValueCountFrequency (%)
Ӧ 1
100.0%
Greek
ValueCountFrequency (%)
β 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31975557
99.8%
None 63807
 
0.2%
IPA Ext 1
 
< 0.1%
Cyrillic 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 4677007
 
14.6%
. 2978690
 
9.3%
e 2217324
 
6.9%
a 1603560
 
5.0%
r 1548210
 
4.8%
n 1462460
 
4.6%
o 1456032
 
4.6%
i 1334932
 
4.2%
l 1157612
 
3.6%
t 1153395
 
3.6%
Other values (70) 12386335
38.7%
None
ValueCountFrequency (%)
é 10928
17.1%
á 10904
17.1%
ó 9861
15.5%
í 7315
11.5%
ñ 6341
9.9%
è 4437
7.0%
ü 3492
 
5.5%
ö 2715
 
4.3%
ê 1774
 
2.8%
ç 790
 
1.2%
Other values (51) 5250
8.2%
IPA Ext
ValueCountFrequency (%)
ɶ 1
100.0%
Cyrillic
ValueCountFrequency (%)
Ӧ 1
100.0%

individualCount
Text

Distinct819
Distinct (%)< 0.1%
Missing1023
Missing (%)< 0.1%
Memory size18.0 MiB

Length

Max length5
Median length1
Mean length1.031862145
Min length1

Characters and Unicode

Total characters2435659
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique338 ?
Unique (%)< 0.1%

Sample

1st row31
2nd row1
3rd row1
4th row4
5th row1
ValueCountFrequency (%)
1 2047199
86.7%
2 94185
 
4.0%
3 46375
 
2.0%
4 33192
 
1.4%
5 24016
 
1.0%
6 16667
 
0.7%
10 12127
 
0.5%
7 10535
 
0.4%
8 9696
 
0.4%
9 6408
 
0.3%
Other values (809) 60050
 
2.5%

Most occurring characters

ValueCountFrequency (%)
1 2094601
86.0%
2 114742
 
4.7%
3 56960
 
2.3%
4 41191
 
1.7%
5 36721
 
1.5%
0 30807
 
1.3%
6 21946
 
0.9%
7 15204
 
0.6%
8 13757
 
0.6%
9 9730
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2435659
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2094601
86.0%
2 114742
 
4.7%
3 56960
 
2.3%
4 41191
 
1.7%
5 36721
 
1.5%
0 30807
 
1.3%
6 21946
 
0.9%
7 15204
 
0.6%
8 13757
 
0.6%
9 9730
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Common 2435659
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2094601
86.0%
2 114742
 
4.7%
3 56960
 
2.3%
4 41191
 
1.7%
5 36721
 
1.5%
0 30807
 
1.3%
6 21946
 
0.9%
7 15204
 
0.6%
8 13757
 
0.6%
9 9730
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2435659
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2094601
86.0%
2 114742
 
4.7%
3 56960
 
2.3%
4 41191
 
1.7%
5 36721
 
1.5%
0 30807
 
1.3%
6 21946
 
0.9%
7 15204
 
0.6%
8 13757
 
0.6%
9 9730
 
0.4%

sex
Text

Missing 

Distinct3
Distinct (%)< 0.1%
Missing2009611
Missing (%)85.1%
Memory size18.0 MiB

Length

Max length13
Median length4
Mean length4.89924459
Min length4

Characters and Unicode

Total characters1723858
Distinct characters12
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowFEMALE
2nd rowMALE
3rd rowMALE
4th rowMALE
5th rowMALE
ValueCountFrequency (%)
male 193937
55.1%
female 157845
44.9%
hermaphrodite 80
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
E 509787
29.6%
M 351862
20.4%
A 351862
20.4%
L 351782
20.4%
F 157845
 
9.2%
H 160
 
< 0.1%
R 160
 
< 0.1%
P 80
 
< 0.1%
O 80
 
< 0.1%
D 80
 
< 0.1%
Other values (2) 160
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1723858
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 509787
29.6%
M 351862
20.4%
A 351862
20.4%
L 351782
20.4%
F 157845
 
9.2%
H 160
 
< 0.1%
R 160
 
< 0.1%
P 80
 
< 0.1%
O 80
 
< 0.1%
D 80
 
< 0.1%
Other values (2) 160
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1723858
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 509787
29.6%
M 351862
20.4%
A 351862
20.4%
L 351782
20.4%
F 157845
 
9.2%
H 160
 
< 0.1%
R 160
 
< 0.1%
P 80
 
< 0.1%
O 80
 
< 0.1%
D 80
 
< 0.1%
Other values (2) 160
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1723858
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 509787
29.6%
M 351862
20.4%
A 351862
20.4%
L 351782
20.4%
F 157845
 
9.2%
H 160
 
< 0.1%
R 160
 
< 0.1%
P 80
 
< 0.1%
O 80
 
< 0.1%
D 80
 
< 0.1%
Other values (2) 160
 
< 0.1%

lifeStage
Text

Missing 

Distinct30
Distinct (%)< 0.1%
Missing2107148
Missing (%)89.2%
Memory size18.0 MiB

Length

Max length10
Median length5
Mean length6.528029097
Min length3

Characters and Unicode

Total characters1660241
Distinct characters39
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rowAdult
2nd rowAdult
3rd rowAdult
4th rowFruiting
5th rowFlowering
ValueCountFrequency (%)
adult 137714
54.1%
flowering 49923
 
19.6%
fruiting 26364
 
10.4%
juvenile 15425
 
6.1%
immature 8914
 
3.5%
vegetative 6008
 
2.4%
larva 5218
 
2.1%
subadult 1137
 
0.4%
chick 960
 
0.4%
embryo 589
 
0.2%
Other values (20) 2073
 
0.8%

Most occurring characters

ValueCountFrequency (%)
l 204968
12.3%
u 191207
11.5%
t 187477
11.3%
d 138859
8.4%
A 137714
8.3%
i 125782
7.6%
e 108742
 
6.5%
n 93053
 
5.6%
r 91120
 
5.5%
g 83572
 
5.0%
Other values (29) 297747
17.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1405916
84.7%
Uppercase Letter 254325
 
15.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 204968
14.6%
u 191207
13.6%
t 187477
13.3%
d 138859
9.9%
i 125782
8.9%
e 108742
7.7%
n 93053
6.6%
r 91120
6.5%
g 83572
5.9%
o 50963
 
3.6%
Other values (12) 130173
9.3%
Uppercase Letter
ValueCountFrequency (%)
A 137714
54.1%
F 76458
30.1%
J 15425
 
6.1%
I 8914
 
3.5%
V 6031
 
2.4%
L 5218
 
2.1%
S 1138
 
0.4%
C 963
 
0.4%
E 950
 
0.4%
H 575
 
0.2%
Other values (7) 939
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 1660241
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 204968
12.3%
u 191207
11.5%
t 187477
11.3%
d 138859
8.4%
A 137714
8.3%
i 125782
7.6%
e 108742
 
6.5%
n 93053
 
5.6%
r 91120
 
5.5%
g 83572
 
5.0%
Other values (29) 297747
17.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1660241
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 204968
12.3%
u 191207
11.5%
t 187477
11.3%
d 138859
8.4%
A 137714
8.3%
i 125782
7.6%
e 108742
 
6.5%
n 93053
 
5.6%
r 91120
 
5.5%
g 83572
 
5.0%
Other values (29) 297747
17.9%

occurrenceStatus
Text

Distinct2
Distinct (%)< 0.1%
Missing1
Missing (%)< 0.1%
Memory size18.0 MiB

Length

Max length7
Median length7
Mean length6.999555786
Min length6

Characters and Unicode

Total characters16529255
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPRESENT
2nd rowPRESENT
3rd rowPRESENT
4th rowPRESENT
5th rowPRESENT
ValueCountFrequency (%)
present 2360423
> 99.9%
absent 1049
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
E 4721895
28.6%
S 2361472
14.3%
N 2361472
14.3%
T 2361472
14.3%
P 2360423
14.3%
R 2360423
14.3%
A 1049
 
< 0.1%
B 1049
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 16529255
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 4721895
28.6%
S 2361472
14.3%
N 2361472
14.3%
T 2361472
14.3%
P 2360423
14.3%
R 2360423
14.3%
A 1049
 
< 0.1%
B 1049
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 16529255
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 4721895
28.6%
S 2361472
14.3%
N 2361472
14.3%
T 2361472
14.3%
P 2360423
14.3%
R 2360423
14.3%
A 1049
 
< 0.1%
B 1049
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16529255
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 4721895
28.6%
S 2361472
14.3%
N 2361472
14.3%
T 2361472
14.3%
P 2360423
14.3%
R 2360423
14.3%
A 1049
 
< 0.1%
B 1049
 
< 0.1%

preparations
Text

Missing 

Distinct1125
Distinct (%)0.1%
Missing1223408
Missing (%)51.8%
Memory size18.0 MiB

Length

Max length192
Median length154
Mean length9.646168716
Min length3

Characters and Unicode

Total characters10977967
Distinct characters74
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique452 ?
Unique (%)< 0.1%

Sample

1st rowAlcohol (Ethanol)
2nd rowEthanol
3rd rowDry
4th rowAlcohol (Ethanol)
5th rowPinned
ValueCountFrequency (%)
ethanol 373474
21.8%
dry 234832
13.7%
alcohol 228646
13.3%
skin 213511
12.5%
whole 136729
 
8.0%
skull 114952
 
6.7%
pinned 99259
 
5.8%
slide 49718
 
2.9%
fluid 34184
 
2.0%
envelope 29335
 
1.7%
Other values (239) 199700
11.6%

Most occurring characters

ValueCountFrequency (%)
l 1411530
 
12.9%
o 1133314
 
10.3%
n 892191
 
8.1%
h 772057
 
7.0%
(space) 576275
 
5.2%
i 492145
 
4.5%
e 477125
 
4.3%
a 473893
 
4.3%
t 460115
 
4.2%
S 424074
 
3.9%
Other values (64) 3865248
35.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7927169
72.2%
Uppercase Letter 1683886
 
15.3%
Space Separator 576275
 
5.2%
Other Punctuation 285737
 
2.6%
Open Punctuation 244959
 
2.2%
Close Punctuation 244959
 
2.2%
Decimal Number 10059
 
0.1%
Dash Punctuation 4923
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 1411530
17.8%
o 1133314
14.3%
n 892191
11.3%
h 772057
9.7%
i 492145
 
6.2%
e 477125
 
6.0%
a 473893
 
6.0%
t 460115
 
5.8%
k 359224
 
4.5%
r 316217
 
4.0%
Other values (16) 1139358
14.4%
Uppercase Letter
ValueCountFrequency (%)
S 424074
25.2%
E 414250
24.6%
D 235606
14.0%
A 232250
13.8%
W 147923
 
8.8%
P 120049
 
7.1%
F 44864
 
2.7%
M 15691
 
0.9%
B 7803
 
0.5%
L 6693
 
0.4%
Other values (15) 34683
 
2.1%
Decimal Number
ValueCountFrequency (%)
9 4647
46.2%
5 4573
45.5%
0 456
 
4.5%
8 255
 
2.5%
7 107
 
1.1%
2 9
 
0.1%
1 9
 
0.1%
3 2
 
< 0.1%
6 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
: 143613
50.3%
; 135363
47.4%
% 5018
 
1.8%
& 836
 
0.3%
/ 806
 
0.3%
. 62
 
< 0.1%
, 37
 
< 0.1%
? 2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 243972
99.6%
[ 987
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 243972
99.6%
] 987
 
0.4%
Space Separator
ValueCountFrequency (%)
(space) 576275
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4923
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9611055
87.5%
Common 1366912
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 1411530
14.7%
o 1133314
11.8%
n 892191
 
9.3%
h 772057
 
8.0%
i 492145
 
5.1%
e 477125
 
5.0%
a 473893
 
4.9%
t 460115
 
4.8%
S 424074
 
4.4%
E 414250
 
4.3%
Other values (41) 2660361
27.7%
Common
ValueCountFrequency (%)
(space) 576275
42.2%
( 243972
17.8%
) 243972
17.8%
: 143613
 
10.5%
; 135363
 
9.9%
% 5018
 
0.4%
- 4923
 
0.4%
9 4647
 
0.3%
5 4573
 
0.3%
] 987
 
0.1%
Other values (13) 3569
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10977967
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 1411530
 
12.9%
o 1133314
 
10.3%
n 892191
 
8.1%
h 772057
 
7.0%
(space) 576275
 
5.2%
i 492145
 
4.5%
e 477125
 
4.3%
a 473893
 
4.3%
t 460115
 
4.2%
S 424074
 
3.9%
Other values (64) 3865248
35.2%

associatedSequences
Text

Missing 

Distinct3083
Distinct (%)99.4%
Missing2358372
Missing (%)99.9%
Memory size18.0 MiB

Length

Max length12558
Median length49
Mean length106.4991938
Min length47

Characters and Unicode

Total characters330254
Distinct characters63
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3075 ?
Unique (%)99.2%

Sample

1st rowhttps://www.ncbi.nlm.nih.gov/gquery?term=KM080038
2nd rowhttps://www.ncbi.nlm.nih.gov/gquery?term=EU823242;https://www.ncbi.nlm.nih.gov/gquery?term=EU823167;https://www.ncbi.nlm.nih.gov/gquery?term=KC246618
3rd rowhttps://www.ncbi.nlm.nih.gov/gquery?term=MN549733
4th rowhttps://www.ncbi.nlm.nih.gov/gquery?term=KC771789;https://www.ncbi.nlm.nih.gov/gquery?term=KC771632
5th rowhttps://www.ncbi.nlm.nih.gov/gquery?term=HQ600894
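
The sample rows above suggest that a cell may hold several semicolon-separated URLs, each carrying an accession in its `term` query parameter. A minimal sketch for splitting such a cell, assuming that layout holds throughout the column (the helper name `accession_ids` is hypothetical):

```python
from urllib.parse import urlsplit, parse_qs

def accession_ids(cell):
    """Split a semicolon-delimited cell into the `term` values of its URLs."""
    ids = []
    for url in cell.split(";"):
        # parse_qs returns a dict of lists, e.g. {"term": ["KC771789"]}
        terms = parse_qs(urlsplit(url.strip()).query).get("term", [])
        ids.extend(terms)
    return ids

# The 4th sample row from this report.
row = ("https://www.ncbi.nlm.nih.gov/gquery?term=KC771789;"
       "https://www.ncbi.nlm.nih.gov/gquery?term=KC771632")
ids = accession_ids(row)
print(ids)
```

Counting the extracted identifiers per row gives the number of linked sequences per specimen, which the raw text length alone does not show.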
ValueCountFrequency (%)
https://www.ncbi.nlm.nih.gov/gquery?term=prjna521985 8
 
0.3%
https://www.ncbi.nlm.nih.gov/gquery?term=km521547 5
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=ay273864 3
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=kf989555;https://www.ncbi.nlm.nih.gov/gquery?term=kf989872;https://www.ncbi.nlm.nih.gov/gquery?term=kf989774;https://www.ncbi.nlm.nih.gov/gquery?term=kf989974;https://www.ncbi.nlm.nih.gov/gquery?term=kf989663 2
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=kp739770 2
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=ay273835 2
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=mh244118 2
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=jn837192;https://www.ncbi.nlm.nih.gov/gquery?term=jn837282;https://www.ncbi.nlm.nih.gov/gquery?term=jn837372;https://www.ncbi.nlm.nih.gov/gquery?term=jn837475 2
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=kc771789;https://www.ncbi.nlm.nih.gov/gquery?term=kc771632 1
 
< 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=mw203870;https://www.ncbi.nlm.nih.gov/gquery?term=mw124994 1
 
< 0.1%
Other values (3073) 3073
99.1%

Most occurring characters

ValueCountFrequency (%)
. 26569
 
8.0%
t 19917
 
6.0%
/ 19917
 
6.0%
w 19917
 
6.0%
n 19917
 
6.0%
i 13278
 
4.0%
r 13278
 
4.0%
e 13278
 
4.0%
m 13278
 
4.0%
h 13278
 
4.0%
Other values (53) 157627
47.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 205809
62.3%
Other Punctuation 63302
 
19.2%
Decimal Number 40399
 
12.2%
Uppercase Letter 13972
 
4.2%
Math Symbol 6639
 
2.0%
Dash Punctuation 132
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
K 2560
18.3%
M 1633
11.7%
J 1366
9.8%
U 1146
 
8.2%
Q 975
 
7.0%
F 824
 
5.9%
E 586
 
4.2%
R 568
 
4.1%
T 523
 
3.7%
N 470
 
3.4%
Other values (16) 3321
23.8%
Lowercase Letter
ValueCountFrequency (%)
t 19917
 
9.7%
w 19917
 
9.7%
n 19917
 
9.7%
i 13278
 
6.5%
r 13278
 
6.5%
e 13278
 
6.5%
m 13278
 
6.5%
h 13278
 
6.5%
g 13278
 
6.5%
u 6639
 
3.2%
Other values (9) 59751
29.0%
Decimal Number
ValueCountFrequency (%)
7 4742
11.7%
2 4353
10.8%
4 4156
10.3%
9 4080
10.1%
8 4030
10.0%
1 3949
9.8%
6 3797
9.4%
0 3796
9.4%
3 3770
9.3%
5 3726
9.2%
Other Punctuation
ValueCountFrequency (%)
. 26569
42.0%
/ 19917
31.5%
? 6639
 
10.5%
: 6639
 
10.5%
; 3538
 
5.6%
Math Symbol
ValueCountFrequency (%)
= 6639
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 132
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 219781
66.5%
Common 110473
33.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 19917
 
9.1%
w 19917
 
9.1%
n 19917
 
9.1%
i 13278
 
6.0%
r 13278
 
6.0%
e 13278
 
6.0%
m 13278
 
6.0%
h 13278
 
6.0%
g 13278
 
6.0%
u 6639
 
3.0%
Other values (35) 73723
33.5%
Common
ValueCountFrequency (%)
. 26569
24.1%
/ 19917
18.0%
= 6639
 
6.0%
? 6639
 
6.0%
: 6639
 
6.0%
7 4742
 
4.3%
2 4353
 
3.9%
4 4156
 
3.8%
9 4080
 
3.7%
8 4030
 
3.6%
Other values (8) 22709
20.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 330254
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 26569
 
8.0%
t 19917
 
6.0%
/ 19917
 
6.0%
w 19917
 
6.0%
n 19917
 
6.0%
i 13278
 
4.0%
r 13278
 
4.0%
e 13278
 
4.0%
m 13278
 
4.0%
h 13278
 
4.0%
Other values (53) 157627
47.7%

occurrenceRemarks
Text

Missing 

Distinct: 167099
Distinct (%): 53.2%
Missing: 2047572
Missing (%): 86.7%
Memory size: 18.0 MiB

Length

Max length: 197629
Median length: 2471
Mean length: 67.07117849
Min length: 1

Characters and Unicode

Total characters: 21053710
Distinct characters: 164
Distinct categories: 20
Distinct scripts: 4
Distinct blocks: 5

Unique

Unique: 146202
Unique (%): 46.6%

Sample

1st row: Ninoe sp. B
2nd row: {"hostGen":"Wallago","hostSpec":"after","hostBodyLoc":"stomach"}; Original USNPC preservative was a solution of 70% ethanol, 3% formalin, and 2% glycerine
3rd row: {"hostGen":"Catoptrophorus","hostSpec":"semipalmatus","hostBodyLoc":"esophagus","hostFldNo":"JEBadley-426-23"}; Glycerin jelly
4th row: Scripps Institution of Oceanography library archives about M.J. Johnson Phyllosoma Collection: specimens were stained with fast green and are mounted mostly in Canada balsam, Harleco synthetic resin or diatex.
5th row: 8/28/28; 6527; Orcutt; Chamberlain Coll
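
Some occurrenceRemarks samples begin with a JSON object (host genus, species, body location) followed by `;` and free text, while others are plain text. A minimal sketch for separating the two parts is shown below, assuming the JSON object, when present, always comes first; `split_remark` is a hypothetical name. The `try`/`except` covers values whose braces do not form valid JSON.

```python
import json


def split_remark(remark):
    """Split a remark into (leading JSON object or None, remaining free text)."""
    text = remark.lstrip()
    if not text.startswith("{"):
        return None, remark
    try:
        # raw_decode parses one JSON value and reports the index where it ended.
        obj, end = json.JSONDecoder().raw_decode(text)
    except json.JSONDecodeError:
        return None, remark
    rest = text[end:].lstrip("; ")
    return obj, rest


# Value copied from the 3rd sample row.
host, note = split_remark(
    '{"hostGen":"Catoptrophorus","hostSpec":"semipalmatus",'
    '"hostBodyLoc":"esophagus","hostFldNo":"JEBadley-426-23"}; Glycerin jelly'
)
print(host["hostGen"], "|", note)
```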
ValueCountFrequency (%)
of 64177
 
2.1%
by 48564
 
1.6%
and 45626
 
1.5%
the 43989
 
1.4%
coll 38399
 
1.3%
34601
 
1.1%
a 34537
 
1.1%
to 31161
 
1.0%
was 27077
 
0.9%
in 26228
 
0.9%
Other values (150526) 2642394
87.0%

Most occurring characters

ValueCountFrequency (%)
2687843
 
12.8%
e 1443748
 
6.9%
o 1130071
 
5.4%
a 1127981
 
5.4%
i 1022042
 
4.9%
t 975063
 
4.6%
n 951404
 
4.5%
r 864162
 
4.1%
s 821052
 
3.9%
l 811731
 
3.9%
Other values (154) 9218613
43.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12861737
61.1%
Space Separator 2687843
 
12.8%
Uppercase Letter 2138640
 
10.2%
Other Punctuation 1606079
 
7.6%
Decimal Number 1358982
 
6.5%
Control 113389
 
0.5%
Dash Punctuation 109232
 
0.5%
Open Punctuation 76596
 
0.4%
Close Punctuation 76560
 
0.4%
Math Symbol 14337
 
0.1%
Other values (10) 10315
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1443748
11.2%
o 1130071
 
8.8%
a 1127981
 
8.8%
i 1022042
 
7.9%
t 975063
 
7.6%
n 951404
 
7.4%
r 864162
 
6.7%
s 821052
 
6.4%
l 811731
 
6.3%
d 545995
 
4.2%
Other values (49) 3168488
24.6%
Uppercase Letter
ValueCountFrequency (%)
S 247477
 
11.6%
C 222887
 
10.4%
P 130396
 
6.1%
N 115110
 
5.4%
B 112916
 
5.3%
M 112131
 
5.2%
F 107958
 
5.0%
T 102498
 
4.8%
A 96139
 
4.5%
L 90325
 
4.2%
Other values (24) 800803
37.4%
Other Punctuation
ValueCountFrequency (%)
. 474207
29.5%
" 320322
19.9%
; 300035
18.7%
, 209967
13.1%
: 172539
 
10.7%
% 43338
 
2.7%
/ 32232
 
2.0%
! 16738
 
1.0%
' 13401
 
0.8%
# 11044
 
0.7%
Other values (8) 12256
 
0.8%
Decimal Number
ValueCountFrequency (%)
1 265876
19.6%
2 181718
13.4%
0 152398
11.2%
9 146413
10.8%
3 119730
8.8%
7 107513
7.9%
5 101445
 
7.5%
4 98758
 
7.3%
6 97441
 
7.2%
8 87690
 
6.5%
Math Symbol
ValueCountFrequency (%)
= 7530
52.5%
+ 3523
24.6%
| 3144
21.9%
~ 52
 
0.4%
> 48
 
0.3%
< 24
 
0.2%
× 13
 
0.1%
± 3
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
° 948
94.9%
29
 
2.9%
11
 
1.1%
© 6
 
0.6%
5
 
0.5%
Other Number
ValueCountFrequency (%)
½ 5
62.5%
³ 1
 
12.5%
¼ 1
 
12.5%
¹ 1
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 108790
99.6%
435
 
0.4%
7
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 51244
66.9%
{ 22611
29.5%
[ 2741
 
3.6%
Close Punctuation
ValueCountFrequency (%)
) 51215
66.9%
} 22607
29.5%
] 2738
 
3.6%
Final Punctuation
ValueCountFrequency (%)
118
95.9%
4
 
3.3%
» 1
 
0.8%
Nonspacing Mark
ValueCountFrequency (%)
́ 93
60.0%
̧ 31
 
20.0%
̀ 31
 
20.0%
Control
ValueCountFrequency (%)
112881
99.6%
508
 
0.4%
Initial Punctuation
ValueCountFrequency (%)
107
99.1%
« 1
 
0.9%
Modifier Symbol
ValueCountFrequency (%)
^ 5
83.3%
´ 1
 
16.7%
Space Separator
ValueCountFrequency (%)
2687843
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 8532
100.0%
Other Letter
ValueCountFrequency (%)
º 288
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 95
100.0%
Format
ValueCountFrequency (%)
 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15000628
71.2%
Common 6052895
28.7%
Inherited 155
 
< 0.1%
Greek 32
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1443748
 
9.6%
o 1130071
 
7.5%
a 1127981
 
7.5%
i 1022042
 
6.8%
t 975063
 
6.5%
n 951404
 
6.3%
r 864162
 
5.8%
s 821052
 
5.5%
l 811731
 
5.4%
d 545995
 
3.6%
Other values (82) 5307379
35.4%
Common
ValueCountFrequency (%)
2687843
44.4%
. 474207
 
7.8%
" 320322
 
5.3%
; 300035
 
5.0%
1 265876
 
4.4%
, 209967
 
3.5%
2 181718
 
3.0%
: 172539
 
2.9%
0 152398
 
2.5%
9 146413
 
2.4%
Other values (58) 1141577
18.9%
Inherited
ValueCountFrequency (%)
́ 93
60.0%
̧ 31
 
20.0%
̀ 31
 
20.0%
Greek
ValueCountFrequency (%)
μ 32
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21050335
> 99.9%
None 2455
 
< 0.1%
Punctuation 720
 
< 0.1%
Diacriticals 155
 
< 0.1%
Misc Symbols 45
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2687843
 
12.8%
e 1443748
 
6.9%
o 1130071
 
5.4%
a 1127981
 
5.4%
i 1022042
 
4.9%
t 975063
 
4.6%
n 951404
 
4.5%
r 864162
 
4.1%
s 821052
 
3.9%
l 811731
 
3.9%
Other values (86) 9215238
43.8%
None
ValueCountFrequency (%)
° 948
38.6%
é 319
 
13.0%
º 288
 
11.7%
í 224
 
9.1%
ñ 95
 
3.9%
á 80
 
3.3%
· 67
 
2.7%
è 48
 
2.0%
ü 45
 
1.8%
ã 44
 
1.8%
Other values (45) 297
 
12.1%
Punctuation
ValueCountFrequency (%)
435
60.4%
118
 
16.4%
107
 
14.9%
41
 
5.7%
8
 
1.1%
7
 
1.0%
4
 
0.6%
Diacriticals
ValueCountFrequency (%)
́ 93
60.0%
̧ 31
 
20.0%
̀ 31
 
20.0%
Misc Symbols
ValueCountFrequency (%)
29
64.4%
11
 
24.4%
5
 
11.1%

verbatimLabel
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 2361471
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 45
Median length: 26.5
Mean length: 26.5
Min length: 8

Characters and Unicode

Total characters: 53
Distinct characters: 30
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: -15.6527
2nd row: North America, Canada, Nunavut, Baffin Island
ValueCountFrequency (%)
15.6527 1
14.3%
north 1
14.3%
america 1
14.3%
canada 1
14.3%
nunavut 1
14.3%
baffin 1
14.3%
island 1
14.3%

Most occurring characters

ValueCountFrequency (%)
a 7
 
13.2%
5
 
9.4%
n 4
 
7.5%
, 3
 
5.7%
i 2
 
3.8%
d 2
 
3.8%
u 2
 
3.8%
t 2
 
3.8%
r 2
 
3.8%
N 2
 
3.8%
Other values (20) 22
41.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 31
58.5%
Uppercase Letter 6
 
11.3%
Decimal Number 6
 
11.3%
Space Separator 5
 
9.4%
Other Punctuation 4
 
7.5%
Dash Punctuation 1
 
1.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 7
22.6%
n 4
12.9%
i 2
 
6.5%
d 2
 
6.5%
u 2
 
6.5%
t 2
 
6.5%
r 2
 
6.5%
f 2
 
6.5%
v 1
 
3.2%
s 1
 
3.2%
Other values (6) 6
19.4%
Uppercase Letter
ValueCountFrequency (%)
N 2
33.3%
B 1
16.7%
I 1
16.7%
C 1
16.7%
A 1
16.7%
Decimal Number
ValueCountFrequency (%)
5 2
33.3%
1 1
16.7%
7 1
16.7%
2 1
16.7%
6 1
16.7%
Other Punctuation
ValueCountFrequency (%)
, 3
75.0%
. 1
 
25.0%
Space Separator
ValueCountFrequency (%)
5
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 37
69.8%
Common 16
30.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 7
18.9%
n 4
 
10.8%
i 2
 
5.4%
d 2
 
5.4%
u 2
 
5.4%
t 2
 
5.4%
r 2
 
5.4%
N 2
 
5.4%
f 2
 
5.4%
v 1
 
2.7%
Other values (11) 11
29.7%
Common
ValueCountFrequency (%)
5
31.2%
, 3
18.8%
5 2
 
12.5%
- 1
 
6.2%
1 1
 
6.2%
7 1
 
6.2%
2 1
 
6.2%
6 1
 
6.2%
. 1
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 53
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 7
 
13.2%
5
 
9.4%
n 4
 
7.5%
, 3
 
5.7%
i 2
 
3.8%
d 2
 
3.8%
u 2
 
3.8%
t 2
 
3.8%
r 2
 
3.8%
N 2
 
3.8%
Other values (20) 22
41.5%

materialSampleID
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 2361471
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 13
Median length: 10
Mean length: 10
Min length: 7

Characters and Unicode

Total characters: 20
Distinct characters: 16
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 135.777
2nd row: NORTH_AMERICA
ValueCountFrequency (%)
135.777 1
50.0%
north_america 1
50.0%

Most occurring characters

ValueCountFrequency (%)
7 3
15.0%
R 2
 
10.0%
A 2
 
10.0%
1 1
 
5.0%
3 1
 
5.0%
5 1
 
5.0%
. 1
 
5.0%
N 1
 
5.0%
O 1
 
5.0%
T 1
 
5.0%
Other values (6) 6
30.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 12
60.0%
Decimal Number 6
30.0%
Other Punctuation 1
 
5.0%
Connector Punctuation 1
 
5.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 2
16.7%
A 2
16.7%
N 1
8.3%
O 1
8.3%
T 1
8.3%
H 1
8.3%
M 1
8.3%
E 1
8.3%
I 1
8.3%
C 1
8.3%
Decimal Number
ValueCountFrequency (%)
7 3
50.0%
1 1
 
16.7%
3 1
 
16.7%
5 1
 
16.7%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
60.0%
Common 8
40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 2
16.7%
A 2
16.7%
N 1
8.3%
O 1
8.3%
T 1
8.3%
H 1
8.3%
M 1
8.3%
E 1
8.3%
I 1
8.3%
C 1
8.3%
Common
ValueCountFrequency (%)
7 3
37.5%
1 1
 
12.5%
3 1
 
12.5%
5 1
 
12.5%
. 1
 
12.5%
_ 1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 3
15.0%
R 2
 
10.0%
A 2
 
10.0%
1 1
 
5.0%
3 1
 
5.0%
5 1
 
5.0%
. 1
 
5.0%
N 1
 
5.0%
O 1
 
5.0%
T 1
 
5.0%
Other values (6) 6
30.0%

eventType
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 2361472
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 13
Distinct characters: 10
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Baffin Island
ValueCountFrequency (%)
baffin 1
50.0%
island 1
50.0%

Most occurring characters

ValueCountFrequency (%)
a 2
15.4%
f 2
15.4%
n 2
15.4%
B 1
7.7%
i 1
7.7%
1
7.7%
I 1
7.7%
s 1
7.7%
l 1
7.7%
d 1
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10
76.9%
Uppercase Letter 2
 
15.4%
Space Separator 1
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
20.0%
f 2
20.0%
n 2
20.0%
i 1
10.0%
s 1
10.0%
l 1
10.0%
d 1
10.0%
Uppercase Letter
ValueCountFrequency (%)
B 1
50.0%
I 1
50.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
92.3%
Common 1
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
16.7%
f 2
16.7%
n 2
16.7%
B 1
8.3%
i 1
8.3%
I 1
8.3%
s 1
8.3%
l 1
8.3%
d 1
8.3%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
15.4%
f 2
15.4%
n 2
15.4%
B 1
7.7%
i 1
7.7%
1
7.7%
I 1
7.7%
s 1
7.7%
l 1
7.7%
d 1
7.7%

fieldNumber
Text

Missing 

Distinct: 48666
Distinct (%): 24.7%
Missing: 2164715
Missing (%): 91.7%
Memory size: 18.0 MiB

Length

Max length: 97
Median length: 64
Mean length: 12.74823895
Min length: 1

Characters and Unicode

Total characters: 2508318
Distinct characters: 83
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 22801
Unique (%): 11.6%

Sample

1st row: MMS-MAMES/B3:M4-4
2nd row: USARP/EL/9/740/USC
3rd row: M165503; H.29-118
4th row: USFC/A5151
5th row: USARP/EL/6/369/USC
ValueCountFrequency (%)
vgs 4890
 
1.9%
mms-mafla/jar 4303
 
1.7%
jtw 3701
 
1.4%
bolland/rfb 1880
 
0.7%
bbc 1566
 
0.6%
humes 1397
 
0.5%
1387
 
0.5%
jpem 1304
 
0.5%
lwk 1042
 
0.4%
lk 1037
 
0.4%
Other values (46561) 233531
91.2%

Most occurring characters

ValueCountFrequency (%)
/ 189196
 
7.5%
S 180164
 
7.2%
- 171500
 
6.8%
1 135819
 
5.4%
M 134244
 
5.4%
0 125696
 
5.0%
A 119355
 
4.8%
2 118372
 
4.7%
C 100849
 
4.0%
3 83038
 
3.3%
Other values (73) 1150085
45.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1136258
45.3%
Decimal Number 869208
34.7%
Other Punctuation 223269
 
8.9%
Dash Punctuation 171500
 
6.8%
Space Separator 59280
 
2.4%
Lowercase Letter 45487
 
1.8%
Connector Punctuation 1901
 
0.1%
Open Punctuation 657
 
< 0.1%
Close Punctuation 657
 
< 0.1%
Math Symbol 100
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 180164
15.9%
M 134244
11.8%
A 119355
10.5%
C 100849
 
8.9%
U 72371
 
6.4%
F 63441
 
5.6%
L 52209
 
4.6%
I 51174
 
4.5%
R 50110
 
4.4%
B 48743
 
4.3%
Other values (16) 263598
23.2%
Lowercase Letter
ValueCountFrequency (%)
e 7455
16.4%
r 6894
15.2%
a 6685
14.7%
o 3555
7.8%
l 2812
 
6.2%
i 2524
 
5.5%
u 2399
 
5.3%
s 2313
 
5.1%
t 2095
 
4.6%
m 1992
 
4.4%
Other values (16) 6763
14.9%
Other Punctuation
ValueCountFrequency (%)
/ 189196
84.7%
: 20778
 
9.3%
; 9334
 
4.2%
. 2271
 
1.0%
, 940
 
0.4%
# 553
 
0.2%
\ 93
 
< 0.1%
? 36
 
< 0.1%
& 34
 
< 0.1%
' 18
 
< 0.1%
Other values (3) 16
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 135819
15.6%
0 125696
14.5%
2 118372
13.6%
3 83038
9.6%
5 82324
9.5%
4 72440
8.3%
7 68844
7.9%
6 67012
7.7%
8 59728
6.9%
9 55935
6.4%
Math Symbol
ValueCountFrequency (%)
+ 98
98.0%
= 2
 
2.0%
Dash Punctuation
ValueCountFrequency (%)
- 171500
100.0%
Space Separator
ValueCountFrequency (%)
59280
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1901
100.0%
Open Punctuation
ValueCountFrequency (%)
( 657
100.0%
Close Punctuation
ValueCountFrequency (%)
) 657
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1326573
52.9%
Latin 1181745
47.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 180164
15.2%
M 134244
11.4%
A 119355
 
10.1%
C 100849
 
8.5%
U 72371
 
6.1%
F 63441
 
5.4%
L 52209
 
4.4%
I 51174
 
4.3%
R 50110
 
4.2%
B 48743
 
4.1%
Other values (42) 309085
26.2%
Common
ValueCountFrequency (%)
/ 189196
14.3%
- 171500
12.9%
1 135819
10.2%
0 125696
9.5%
2 118372
8.9%
3 83038
 
6.3%
5 82324
 
6.2%
4 72440
 
5.5%
7 68844
 
5.2%
6 67012
 
5.1%
Other values (21) 212332
16.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2508317
> 99.9%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 189196
 
7.5%
S 180164
 
7.2%
- 171500
 
6.8%
1 135819
 
5.4%
M 134244
 
5.4%
0 125696
 
5.0%
A 119355
 
4.8%
2 118372
 
4.7%
C 100849
 
4.0%
3 83038
 
3.3%
Other values (72) 1150084
45.9%
Punctuation
ValueCountFrequency (%)
1
100.0%

eventDate
Text

Missing 

Distinct: 79092
Distinct (%): 4.1%
Missing: 419648
Missing (%): 17.8%
Memory size: 18.0 MiB

Length

Max length: 21
Median length: 10
Mean length: 9.989250319
Min length: 4

Characters and Unicode

Total characters: 19397376
Distinct characters: 18
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 14082
Unique (%): 0.7%

Sample

1st row: 1981-04-24
2nd row: 1952-03-30
3rd row: 1958-08-06
4th row: 1900-11
5th row: 1988-08-20
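
The samples show eventDate at more than one precision (`1981-04-24`, `1900-11`). The length range of 4 to 21 characters, together with the `/` characters counted in this variable's character tables, suggests that year-only values and start/end intervals also occur. A minimal sketch for splitting such values into components follows, assuming only these ISO 8601-style shapes; `parse_event_date` is a hypothetical name and the interval in the usage line is an invented example, not a value from the dataset.

```python
import re

# Year, optional month, optional day, e.g. "1915", "1900-11", "1981-04-24".
_PART = re.compile(r"^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$")


def parse_event_date(value):
    """Return a list of (year, month, day) tuples, one per interval endpoint.

    Missing month or day components are reported as None. Returns None if any
    endpoint does not match the expected pattern.
    """
    endpoints = []
    for part in value.strip().split("/"):
        match = _PART.match(part)
        if match is None:
            return None
        endpoints.append(tuple(int(g) if g else None for g in match.groups()))
    return endpoints


print(parse_event_date("1900-11"))
print(parse_event_date("1981-04-24/1981-04-28"))  # invented interval
```

The function does not check calendar validity (for example month 13); that would need a further step.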
ValueCountFrequency (%)
1915 2128
 
0.1%
1913 1918
 
0.1%
1916 1707
 
0.1%
1891 1468
 
0.1%
1982-07-21 1436
 
0.1%
1981-07-06 1349
 
0.1%
1923 1342
 
0.1%
1982-11-19 1332
 
0.1%
1880 1329
 
0.1%
1929 1317
 
0.1%
Other values (79082) 1926499
99.2%

Most occurring characters

ValueCountFrequency (%)
- 3722241
19.2%
1 3671544
18.9%
0 2955423
15.2%
9 2451007
12.6%
2 1415307
 
7.3%
8 1113644
 
5.7%
7 900586
 
4.6%
6 896747
 
4.6%
3 759476
 
3.9%
5 731380
 
3.8%
Other values (8) 780021
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 15582458
80.3%
Dash Punctuation 3722241
 
19.2%
Other Punctuation 92670
 
0.5%
Lowercase Letter 6
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 3671544
23.6%
0 2955423
19.0%
9 2451007
15.7%
2 1415307
 
9.1%
8 1113644
 
7.1%
7 900586
 
5.8%
6 896747
 
5.8%
3 759476
 
4.9%
5 731380
 
4.7%
4 687344
 
4.4%
Lowercase Letter
ValueCountFrequency (%)
u 2
33.3%
n 1
16.7%
a 1
16.7%
v 1
16.7%
t 1
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 3722241
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 92670
100.0%
Uppercase Letter
ValueCountFrequency (%)
N 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 19397369
> 99.9%
Latin 7
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
- 3722241
19.2%
1 3671544
18.9%
0 2955423
15.2%
9 2451007
12.6%
2 1415307
 
7.3%
8 1113644
 
5.7%
7 900586
 
4.6%
6 896747
 
4.6%
3 759476
 
3.9%
5 731380
 
3.8%
Other values (2) 780014
 
4.0%
Latin
ValueCountFrequency (%)
u 2
28.6%
N 1
14.3%
n 1
14.3%
a 1
14.3%
v 1
14.3%
t 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19397376
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 3722241
19.2%
1 3671544
18.9%
0 2955423
15.2%
9 2451007
12.6%
2 1415307
 
7.3%
8 1113644
 
5.7%
7 900586
 
4.6%
6 896747
 
4.6%
3 759476
 
3.9%
5 731380
 
3.8%
Other values (8) 780021
 
4.0%

startDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): < 0.1%
Missing: 669491
Missing (%): 28.4%
Memory size: 18.0 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.764176569
Min length: 1

Characters and Unicode

Total characters: 4676937
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 114
2nd row: 90
3rd row: 218
4th row: 233
5th row: 189
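
The first three sample values here (114, 90, 218) appear to be the ordinal day of the year of the first three eventDate samples (1981-04-24, 1952-03-30, 1958-08-06). A short sketch of that derivation is given below, assuming a full `YYYY-MM-DD` date; `day_of_year` is a hypothetical helper, and partial dates such as `1900-11` would need separate handling.

```python
from datetime import date


def day_of_year(iso_date):
    """Return the 1-based ordinal day of the year for a full YYYY-MM-DD date."""
    return date.fromisoformat(iso_date).timetuple().tm_yday


for d in ("1981-04-24", "1952-03-30", "1958-08-06"):
    print(d, day_of_year(d))
```

Note that 1952 is a leap year, which is why 30 March falls on day 90 rather than 89.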
ValueCountFrequency (%)
202 8899
 
0.5%
196 7844
 
0.5%
199 7829
 
0.5%
206 7783
 
0.5%
210 7720
 
0.5%
187 7619
 
0.5%
201 7549
 
0.4%
200 7529
 
0.4%
219 7370
 
0.4%
197 7339
 
0.4%
Other values (356) 1614501
95.4%

Most occurring characters

ValueCountFrequency (%)
1 936211
20.0%
2 897252
19.2%
3 558831
11.9%
4 348360
 
7.4%
5 343495
 
7.3%
6 328341
 
7.0%
0 325792
 
7.0%
9 319075
 
6.8%
7 311518
 
6.7%
8 308062
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4676937
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 936211
20.0%
2 897252
19.2%
3 558831
11.9%
4 348360
 
7.4%
5 343495
 
7.3%
6 328341
 
7.0%
0 325792
 
7.0%
9 319075
 
6.8%
7 311518
 
6.7%
8 308062
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Common 4676937
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 936211
20.0%
2 897252
19.2%
3 558831
11.9%
4 348360
 
7.4%
5 343495
 
7.3%
6 328341
 
7.0%
0 325792
 
7.0%
9 319075
 
6.8%
7 311518
 
6.7%
8 308062
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4676937
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 936211
20.0%
2 897252
19.2%
3 558831
11.9%
4 348360
 
7.4%
5 343495
 
7.3%
6 328341
 
7.0%
0 325792
 
7.0%
9 319075
 
6.8%
7 311518
 
6.7%
8 308062
 
6.6%

endDayOfYear
Text

Missing 

Distinct: 367
Distinct (%): < 0.1%
Missing: 669490
Missing (%): 28.4%
Memory size: 18.0 MiB

Length

Max length: 69
Median length: 3
Mean length: 2.765497053
Min length: 1

Characters and Unicode

Total characters: 4679174
Distinct characters: 36
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 114
2nd row: 90
3rd row: 218
4th row: 233
5th row: 189
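
Unlike startDayOfYear, this variable has 367 distinct values, a maximum length of 69, and one unique value, and its character tables include a few dozen letters. That pattern would be consistent with a single free-text value having landed in the column, though the report does not show the value itself. A minimal check that flags anything other than an integer from 1 to 366 is sketched below; `is_valid_day_of_year` is a hypothetical name and the rejected strings in the usage lines are invented.

```python
def is_valid_day_of_year(value):
    """True if `value` is a string of ASCII digits denoting an integer in 1..366."""
    text = value.strip()
    return text.isascii() and text.isdigit() and 1 <= int(text) <= 366


print(is_valid_day_of_year("114"))            # a sample value from this variable
print(is_valid_day_of_year("[Field notes]"))  # invented non-numeric value
```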
ValueCountFrequency (%)
202 8881
 
0.5%
210 7894
 
0.5%
200 7857
 
0.5%
196 7822
 
0.5%
191 7738
 
0.5%
199 7653
 
0.5%
206 7645
 
0.5%
197 7551
 
0.4%
187 7541
 
0.4%
201 7478
 
0.4%
Other values (364) 1613930
95.4%

Most occurring characters

ValueCountFrequency (%)
1 936531
20.0%
2 898023
19.2%
3 560326
12.0%
4 349297
 
7.5%
5 343958
 
7.4%
0 326366
 
7.0%
6 325151
 
6.9%
9 317474
 
6.8%
7 312820
 
6.7%
8 309159
 
6.6%
Other values (26) 69
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4679105
> 99.9%
Lowercase Letter 52
 
< 0.1%
Space Separator 7
 
< 0.1%
Uppercase Letter 7
 
< 0.1%
Other Punctuation 1
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 11
21.2%
i 6
11.5%
t 5
9.6%
e 4
 
7.7%
o 4
 
7.7%
n 4
 
7.7%
a 3
 
5.8%
s 3
 
5.8%
b 2
 
3.8%
l 2
 
3.8%
Other values (7) 8
15.4%
Decimal Number
ValueCountFrequency (%)
1 936531
20.0%
2 898023
19.2%
3 560326
12.0%
4 349297
 
7.5%
5 343958
 
7.4%
0 326366
 
7.0%
6 325151
 
6.9%
9 317474
 
6.8%
7 312820
 
6.7%
8 309159
 
6.6%
Uppercase Letter
ValueCountFrequency (%)
F 2
28.6%
D 2
28.6%
T 1
14.3%
N 1
14.3%
H 1
14.3%
Space Separator
ValueCountFrequency (%)
7
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 1
100.0%
Close Punctuation
ValueCountFrequency (%)
] 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4679115
> 99.9%
Latin 59
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 11
18.6%
i 6
 
10.2%
t 5
 
8.5%
e 4
 
6.8%
o 4
 
6.8%
n 4
 
6.8%
a 3
 
5.1%
s 3
 
5.1%
F 2
 
3.4%
b 2
 
3.4%
Other values (12) 15
25.4%
Common
ValueCountFrequency (%)
1 936531
20.0%
2 898023
19.2%
3 560326
12.0%
4 349297
 
7.5%
5 343958
 
7.4%
0 326366
 
7.0%
6 325151
 
6.9%
9 317474
 
6.8%
7 312820
 
6.7%
8 309159
 
6.6%
Other values (4) 10
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4679174
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 936531
20.0%
2 898023
19.2%
3 560326
12.0%
4 349297
 
7.5%
5 343958
 
7.4%
0 326366
 
7.0%
6 325151
 
6.9%
9 317474
 
6.8%
7 312820
 
6.7%
8 309159
 
6.6%
Other values (26) 69
 
< 0.1%

year
Text

Missing 

Distinct: 301
Distinct (%): < 0.1%
Missing: 423106
Missing (%): 17.9%
Memory size: 18.0 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 7753468
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 47
Unique (%): < 0.1%

Sample

1st row: 1981
2nd row: 1952
3rd row: 1958
4th row: 1900
5th row: 1988
ValueCountFrequency (%)
1966 36263
 
1.9%
1967 33378
 
1.7%
1964 33172
 
1.7%
1977 31555
 
1.6%
1968 31427
 
1.6%
1965 29410
 
1.5%
1969 28034
 
1.4%
1963 25250
 
1.3%
1970 25083
 
1.3%
1971 24744
 
1.3%
Other values (291) 1640051
84.6%

Most occurring characters

ValueCountFrequency (%)
1 2184214
28.2%
9 2015776
26.0%
8 680289
 
8.8%
0 504133
 
6.5%
6 492091
 
6.3%
7 448375
 
5.8%
2 417461
 
5.4%
5 340164
 
4.4%
4 339115
 
4.4%
3 331850
 
4.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7753468
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2184214
28.2%
9 2015776
26.0%
8 680289
 
8.8%
0 504133
 
6.5%
6 492091
 
6.3%
7 448375
 
5.8%
2 417461
 
5.4%
5 340164
 
4.4%
4 339115
 
4.4%
3 331850
 
4.3%

Most occurring scripts

ValueCountFrequency (%)
Common 7753468
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2184214
28.2%
9 2015776
26.0%
8 680289
 
8.8%
0 504133
 
6.5%
6 492091
 
6.3%
7 448375
 
5.8%
2 417461
 
5.4%
5 340164
 
4.4%
4 339115
 
4.4%
3 331850
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7753468
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2184214
28.2%
9 2015776
26.0%
8 680289
 
8.8%
0 504133
 
6.5%
6 492091
 
6.3%
7 448375
 
5.8%
2 417461
 
5.4%
5 340164
 
4.4%
4 339115
 
4.4%
3 331850
 
4.3%

month
Text

Missing 

Distinct: 12
Distinct (%): < 0.1%
Missing: 542654
Missing (%): 23.0%
Memory size: 18.0 MiB

Length

Max length: 2
Median length: 1
Mean length: 1.174502795
Min length: 1

Characters and Unicode

Total characters: 2136208
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 3
3rd row: 8
4th row: 11
5th row: 8
ValueCountFrequency (%)
7 238033
13.1%
8 219764
12.1%
6 196912
10.8%
5 185254
10.2%
4 151950
8.4%
9 150297
8.3%
3 138712
7.6%
10 121082
6.7%
2 119259
6.6%
11 110527
6.1%
Other values (2) 187029
10.3%

Most occurring characters

ValueCountFrequency (%)
1 529165
24.8%
7 238033
11.1%
8 219764
10.3%
2 205039
 
9.6%
6 196912
 
9.2%
5 185254
 
8.7%
4 151950
 
7.1%
9 150297
 
7.0%
3 138712
 
6.5%
0 121082
 
5.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2136208
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 529165
24.8%
7 238033
11.1%
8 219764
10.3%
2 205039
 
9.6%
6 196912
 
9.2%
5 185254
 
8.7%
4 151950
 
7.1%
9 150297
 
7.0%
3 138712
 
6.5%
0 121082
 
5.7%

Most occurring scripts

ValueCountFrequency (%)
Common 2136208
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 529165
24.8%
7 238033
11.1%
8 219764
10.3%
2 205039
 
9.6%
6 196912
 
9.2%
5 185254
 
8.7%
4 151950
 
7.1%
9 150297
 
7.0%
3 138712
 
6.5%
0 121082
 
5.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2136208
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 529165
24.8%
7 238033
11.1%
8 219764
10.3%
2 205039
 
9.6%
6 196912
 
9.2%
5 185254
 
8.7%
4 151950
 
7.1%
9 150297
 
7.0%
3 138712
 
6.5%
0 121082
 
5.7%

day
Text

Missing 

Distinct: 32
Distinct (%): < 0.1%
Missing: 762160
Missing (%): 32.3%
Memory size: 18.0 MiB

Length

Max length: 3
Median length: 2
Mean length: 1.709295179
Min length: 1

Characters and Unicode

Total characters: 2733698
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 24
2nd row: 30
3rd row: 6
4th row: 20
5th row: 8
ValueCountFrequency (%)
15 57416
 
3.6%
10 57058
 
3.6%
20 56592
 
3.5%
18 55176
 
3.4%
19 55147
 
3.4%
13 54553
 
3.4%
21 54123
 
3.4%
8 53877
 
3.4%
16 53195
 
3.3%
6 52866
 
3.3%
Other values (22) 1049310
65.6%

Most occurring characters

ValueCountFrequency (%)
1 725188
26.5%
2 674547
24.7%
3 231511
 
8.5%
5 161969
 
5.9%
0 160494
 
5.9%
8 159897
 
5.8%
6 156693
 
5.7%
4 155717
 
5.7%
7 154240
 
5.6%
9 153439
 
5.6%
Other values (3) 3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2733695
> 99.9%
Uppercase Letter 3
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 725188
26.5%
2 674547
24.7%
3 231511
 
8.5%
5 161969
 
5.9%
0 160494
 
5.9%
8 159897
 
5.8%
6 156693
 
5.7%
4 155717
 
5.7%
7 154240
 
5.6%
9 153439
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
G 1
33.3%
P 1
33.3%
S 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 2733695
> 99.9%
Latin 3
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 725188
26.5%
2 674547
24.7%
3 231511
 
8.5%
5 161969
 
5.9%
0 160494
 
5.9%
8 159897
 
5.8%
6 156693
 
5.7%
4 155717
 
5.7%
7 154240
 
5.6%
9 153439
 
5.6%
Latin
ValueCountFrequency (%)
G 1
33.3%
P 1
33.3%
S 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2733698
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 725188
26.5%
2 674547
24.7%
3 231511
 
8.5%
5 161969
 
5.9%
0 160494
 
5.9%
8 159897
 
5.8%
6 156693
 
5.7%
4 155717
 
5.7%
7 154240
 
5.6%
9 153439
 
5.6%
Other values (3) 3
 
< 0.1%

verbatimEventDate
Text

Missing 

Distinct: 182342
Distinct (%): 16.5%
Missing: 1255739
Missing (%): 53.2%
Memory size: 18.0 MiB

Length

Max length: 194
Median length: 11
Mean length: 13.22736662
Min length: 1

Characters and Unicode

Total characters: 14625949
Distinct characters: 97
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 75346
Unique (%): 6.8%

Sample

1st row: 24 APR 1981
2nd row: 6 Aug 1958
3rd row: 24 Jun 1934
4th row: 24 Mar 1974
5th row: 23-29 January 1885
ValueCountFrequency (%)
436209
 
11.7%
00 204346
 
5.5%
0000 95956
 
2.6%
aug 94310
 
2.5%
may 93493
 
2.5%
jul 93386
 
2.5%
jun 83860
 
2.3%
apr 78240
 
2.1%
mar 71932
 
1.9%
sep 67673
 
1.8%
Other values (47806) 2396571
64.5%

Most occurring characters

ValueCountFrequency (%)
2610242
17.8%
1 1589790
 
10.9%
0 1327954
 
9.1%
9 1155041
 
7.9%
- 1063519
 
7.3%
2 604441
 
4.1%
8 461134
 
3.2%
6 404101
 
2.8%
7 366280
 
2.5%
3 329735
 
2.3%
Other values (87) 4713712
32.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6814007
46.6%
Space Separator 2610242
 
17.8%
Lowercase Letter 2418926
 
16.5%
Uppercase Letter 1341399
 
9.2%
Dash Punctuation 1063526
 
7.3%
Other Punctuation 358825
 
2.5%
Open Punctuation 9406
 
0.1%
Close Punctuation 9404
 
0.1%
Connector Punctuation 110
 
< 0.1%
Math Symbol 100
 
< 0.1%
Other values (2) 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 261683
10.8%
a 255572
10.6%
r 254995
10.5%
e 237935
 
9.8%
n 178794
 
7.4%
c 139538
 
5.8%
p 138255
 
5.7%
y 133747
 
5.5%
t 118182
 
4.9%
b 104610
 
4.3%
Other values (21) 595615
24.6%
Uppercase Letter
ValueCountFrequency (%)
J 251561
18.8%
A 224498
16.7%
M 175768
13.1%
N 91005
 
6.8%
S 85919
 
6.4%
O 77756
 
5.8%
F 69162
 
5.2%
T 53522
 
4.0%
U 46648
 
3.5%
D 46534
 
3.5%
Other values (15) 219026
16.3%
Other Punctuation
ValueCountFrequency (%)
/ 180487
50.3%
: 100182
27.9%
; 32334
 
9.0%
. 27778
 
7.7%
, 15385
 
4.3%
' 1482
 
0.4%
* 612
 
0.2%
? 279
 
0.1%
! 136
 
< 0.1%
& 104
 
< 0.1%
Other values (3) 46
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 1589790
23.3%
0 1327954
19.5%
9 1155041
17.0%
2 604441
 
8.9%
8 461134
 
6.8%
6 404101
 
5.9%
7 366280
 
5.4%
3 329735
 
4.8%
5 296345
 
4.3%
4 279186
 
4.1%
Math Symbol
ValueCountFrequency (%)
| 48
48.0%
+ 42
42.0%
= 6
 
6.0%
~ 2
 
2.0%
< 1
 
1.0%
± 1
 
1.0%
Open Punctuation
ValueCountFrequency (%)
[ 8644
91.9%
( 759
 
8.1%
{ 3
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 8641
91.9%
) 760
 
8.1%
} 3
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 1063519
> 99.9%
7
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2610242
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 110
100.0%
Other Number
ValueCountFrequency (%)
½ 2
100.0%
Format
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 10865624
74.3%
Latin 3760325
 
25.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 261683
 
7.0%
a 255572
 
6.8%
r 254995
 
6.8%
J 251561
 
6.7%
e 237935
 
6.3%
A 224498
 
6.0%
n 178794
 
4.8%
M 175768
 
4.7%
c 139538
 
3.7%
p 138255
 
3.7%
Other values (46) 1641726
43.7%
Common
ValueCountFrequency (%)
2610242
24.0%
1 1589790
14.6%
0 1327954
12.2%
9 1155041
10.6%
- 1063519
9.8%
2 604441
 
5.6%
8 461134
 
4.2%
6 404101
 
3.7%
7 366280
 
3.4%
3 329735
 
3.0%
Other values (31) 953387
 
8.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14625911
> 99.9%
None 29
 
< 0.1%
Punctuation 9
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2610242
17.8%
1 1589790
 
10.9%
0 1327954
 
9.1%
9 1155041
 
7.9%
- 1063519
 
7.3%
2 604441
 
4.1%
8 461134
 
3.2%
6 404101
 
2.8%
7 366280
 
2.5%
3 329735
 
2.3%
Other values (78) 4713674
32.2%
None
ValueCountFrequency (%)
é 14
48.3%
û 5
 
17.2%
ü 3
 
10.3%
ä 2
 
6.9%
ô 2
 
6.9%
½ 2
 
6.9%
± 1
 
3.4%
Punctuation
ValueCountFrequency (%)
7
77.8%
2
 
22.2%

habitat
Text

Missing 

Distinct: 72844
Distinct (%): 39.6%
Missing: 2177646
Missing (%): 92.2%
Memory size: 18.0 MiB

Length

Max length: 795
Median length: 504
Mean length: 30.83063424
Min length: 1

Characters and Unicode

Total characters: 5667503
Distinct characters: 131
Distinct categories: 16
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 57211
Unique (%): 31.1%

Sample

1st row: abandoned field
2nd row: In wet mixed hardwood-pine-podocarpus forest.
3rd row: Ecological remarks by collector(s): yes
4th row: Rainforest
5th row: Tropical dry forest
ValueCountFrequency (%)
forest 43320
 
5.0%
on 24894
 
2.9%
and 21426
 
2.5%
in 20924
 
2.4%
with 15194
 
1.8%
of 14970
 
1.7%
by 14737
 
1.7%
remarks 12371
 
1.4%
ecological 12371
 
1.4%
collector(s 12367
 
1.4%
Other values (24035) 672995
77.8%

Most occurring characters

ValueCountFrequency (%)
681742
 
12.0%
e 506903
 
8.9%
a 436200
 
7.7%
o 418533
 
7.4%
r 374871
 
6.6%
s 360402
 
6.4%
n 320370
 
5.7%
i 282376
 
5.0%
t 273985
 
4.8%
l 250910
 
4.4%
Other values (121) 1761211
31.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4546359
80.2%
Space Separator 681742
 
12.0%
Uppercase Letter 241805
 
4.3%
Other Punctuation 139735
 
2.5%
Close Punctuation 15495
 
0.3%
Open Punctuation 15478
 
0.3%
Decimal Number 14355
 
0.3%
Dash Punctuation 11022
 
0.2%
Math Symbol 1458
 
< 0.1%
Other Symbol 31
 
< 0.1%
Other values (6) 23
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 506903
11.1%
a 436200
 
9.6%
o 418533
 
9.2%
r 374871
 
8.2%
s 360402
 
7.9%
n 320370
 
7.0%
i 282376
 
6.2%
t 273985
 
6.0%
l 250910
 
5.5%
d 202477
 
4.5%
Other values (41) 1119332
24.6%
Uppercase Letter
ValueCountFrequency (%)
S 27255
 
11.3%
E 19870
 
8.2%
M 19282
 
8.0%
C 15226
 
6.3%
R 14950
 
6.2%
P 14353
 
5.9%
O 13923
 
5.8%
F 13595
 
5.6%
A 13336
 
5.5%
T 12982
 
5.4%
Other values (20) 77033
31.9%
Other Punctuation
ValueCountFrequency (%)
, 56768
40.6%
. 53985
38.6%
: 13864
 
9.9%
; 7795
 
5.6%
& 2835
 
2.0%
/ 1969
 
1.4%
" 1107
 
0.8%
' 725
 
0.5%
? 229
 
0.2%
% 223
 
0.2%
Other values (6) 235
 
0.2%
Decimal Number
ValueCountFrequency (%)
0 4308
30.0%
1 1914
13.3%
2 1738
12.1%
3 1723
 
12.0%
5 1686
 
11.7%
4 1145
 
8.0%
6 604
 
4.2%
8 507
 
3.5%
9 366
 
2.5%
7 364
 
2.5%
Math Symbol
ValueCountFrequency (%)
+ 731
50.1%
~ 506
34.7%
| 136
 
9.3%
± 41
 
2.8%
= 31
 
2.1%
< 8
 
0.5%
> 5
 
0.3%
Close Punctuation
ValueCountFrequency (%)
) 15265
98.5%
] 168
 
1.1%
} 62
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 15251
98.5%
[ 165
 
1.1%
{ 62
 
0.4%
Dash Punctuation
ValueCountFrequency (%)
- 11006
99.9%
8
 
0.1%
8
 
0.1%
Space Separator
ValueCountFrequency (%)
681742
100.0%
Other Symbol
ValueCountFrequency (%)
° 31
100.0%
Other Letter
ValueCountFrequency (%)
º 8
100.0%
Final Punctuation
ValueCountFrequency (%)
6
100.0%
Initial Punctuation
ValueCountFrequency (%)
4
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 1
100.0%
Other Number
ValueCountFrequency (%)
² 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4788172
84.5%
Common 879331
 
15.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 506903
 
10.6%
a 436200
 
9.1%
o 418533
 
8.7%
r 374871
 
7.8%
s 360402
 
7.5%
n 320370
 
6.7%
i 282376
 
5.9%
t 273985
 
5.7%
l 250910
 
5.2%
d 202477
 
4.2%
Other values (72) 1361145
28.4%
Common
ValueCountFrequency (%)
681742
77.5%
, 56768
 
6.5%
. 53985
 
6.1%
) 15265
 
1.7%
( 15251
 
1.7%
: 13864
 
1.6%
- 11006
 
1.3%
; 7795
 
0.9%
0 4308
 
0.5%
& 2835
 
0.3%
Other values (39) 16512
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5659558
99.9%
None 7888
 
0.1%
Punctuation 57
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
681742
 
12.0%
e 506903
 
9.0%
a 436200
 
7.7%
o 418533
 
7.4%
r 374871
 
6.6%
s 360402
 
6.4%
n 320370
 
5.7%
i 282376
 
5.0%
t 273985
 
4.8%
l 250910
 
4.4%
Other values (82) 1753266
31.0%
None
ValueCountFrequency (%)
ú 1179
14.9%
ê 1157
14.7%
é 1124
14.2%
ó 1102
14.0%
í 913
11.6%
á 821
10.4%
ñ 640
8.1%
è 414
 
5.2%
à 133
 
1.7%
ã 61
 
0.8%
Other values (24) 344
 
4.4%
Punctuation
ValueCountFrequency (%)
31
54.4%
8
 
14.0%
8
 
14.0%
6
 
10.5%
4
 
7.0%

samplingEffort
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 2361472
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 4
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 67.0
ValueCountFrequency (%)
67.0 1
100.0%

Most occurring characters

ValueCountFrequency (%)
6 1
25.0%
7 1
25.0%
. 1
25.0%
0 1
25.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
75.0%
Other Punctuation 1
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 1
33.3%
7 1
33.3%
0 1
33.3%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 1
25.0%
7 1
25.0%
. 1
25.0%
0 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 1
25.0%
7 1
25.0%
. 1
25.0%
0 1
25.0%

fieldNotes
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 2361472
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 5
Distinct characters: 5
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: -63.0
ValueCountFrequency (%)
63.0 1
100.0%

Most occurring characters

ValueCountFrequency (%)
- 1
20.0%
6 1
20.0%
3 1
20.0%
. 1
20.0%
0 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
60.0%
Dash Punctuation 1
 
20.0%
Other Punctuation 1
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 1
33.3%
3 1
33.3%
0 1
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 1
20.0%
6 1
20.0%
3 1
20.0%
. 1
20.0%
0 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 1
20.0%
6 1
20.0%
3 1
20.0%
. 1
20.0%
0 1
20.0%

locationID
Text

Missing 

Distinct: 50052
Distinct (%): 18.1%
Missing: 2084512
Missing (%): 88.3%
Memory size: 18.0 MiB

Length

Max length: 77799
Median length: 131
Mean length: 4.843338954
Min length: 1

Characters and Unicode

Total characters: 1341416
Distinct characters: 99
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28095
Unique (%): 10.1%

Sample

1st row: 31
2nd row: GS 03383
3rd row: M4
4th row: 9
5th row: 68-36
ValueCountFrequency (%)
d 3566
 
1.1%
not 3178
 
1.0%
rec 3070
 
1.0%
4 2339
 
0.7%
1 2281
 
0.7%
rhb 1929
 
0.6%
rfb 1883
 
0.6%
2 1847
 
0.6%
3 1546
 
0.5%
6 1528
 
0.5%
Other values (43774) 294784
92.7%

Most occurring characters

ValueCountFrequency (%)
1 143532
 
10.7%
2 119590
 
8.9%
0 97991
 
7.3%
- 87059
 
6.5%
3 86532
 
6.5%
5 86435
 
6.4%
4 83135
 
6.2%
6 74491
 
5.6%
7 59619
 
4.4%
8 54830
 
4.1%
Other values (89) 448202
33.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 856082
63.8%
Uppercase Letter 265572
 
19.8%
Dash Punctuation 87060
 
6.5%
Lowercase Letter 52220
 
3.9%
Space Separator 36726
 
2.7%
Other Punctuation 24797
 
1.8%
Control 13837
 
1.0%
Connector Punctuation 2785
 
0.2%
Open Punctuation 1126
 
0.1%
Close Punctuation 1009
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6785
13.0%
t 5566
10.7%
o 5269
10.1%
e 5025
9.6%
i 4772
9.1%
n 3806
 
7.3%
r 3521
 
6.7%
l 2693
 
5.2%
s 2124
 
4.1%
u 2087
 
4.0%
Other values (26) 10572
20.2%
Uppercase Letter
ValueCountFrequency (%)
A 27515
 
10.4%
S 24535
 
9.2%
C 21215
 
8.0%
B 18453
 
6.9%
R 17107
 
6.4%
M 16994
 
6.4%
N 16161
 
6.1%
E 15064
 
5.7%
I 13429
 
5.1%
T 12946
 
4.9%
Other values (18) 82153
30.9%
Other Punctuation
ValueCountFrequency (%)
: 10678
43.1%
. 8883
35.8%
, 2581
 
10.4%
/ 1586
 
6.4%
# 425
 
1.7%
; 332
 
1.3%
& 182
 
0.7%
? 70
 
0.3%
* 30
 
0.1%
' 22
 
0.1%
Other values (2) 8
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 143532
16.8%
2 119590
14.0%
0 97991
11.4%
3 86532
10.1%
5 86435
10.1%
4 83135
9.7%
6 74491
8.7%
7 59619
7.0%
8 54830
 
6.4%
9 49927
 
5.8%
Math Symbol
ValueCountFrequency (%)
+ 196
97.0%
= 3
 
1.5%
| 3
 
1.5%
Dash Punctuation
ValueCountFrequency (%)
- 87059
> 99.9%
1
 
< 0.1%
Control
ValueCountFrequency (%)
13775
99.6%
62
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 1047
93.0%
[ 79
 
7.0%
Close Punctuation
ValueCountFrequency (%)
) 930
92.2%
] 79
 
7.8%
Space Separator
ValueCountFrequency (%)
36726
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2785
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1023624
76.3%
Latin 317792
 
23.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 27515
 
8.7%
S 24535
 
7.7%
C 21215
 
6.7%
B 18453
 
5.8%
R 17107
 
5.4%
M 16994
 
5.3%
N 16161
 
5.1%
E 15064
 
4.7%
I 13429
 
4.2%
T 12946
 
4.1%
Other values (54) 134373
42.3%
Common
ValueCountFrequency (%)
1 143532
14.0%
2 119590
11.7%
0 97991
9.6%
- 87059
8.5%
3 86532
8.5%
5 86435
8.4%
4 83135
8.1%
6 74491
7.3%
7 59619
5.8%
8 54830
 
5.4%
Other values (25) 130410
12.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1341390
> 99.9%
None 25
 
< 0.1%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 143532
 
10.7%
2 119590
 
8.9%
0 97991
 
7.3%
- 87059
 
6.5%
3 86532
 
6.5%
5 86435
 
6.4%
4 83135
 
6.2%
6 74491
 
5.6%
7 59619
 
4.4%
8 54830
 
4.1%
Other values (76) 448176
33.4%
None
ValueCountFrequency (%)
é 5
20.0%
ü 4
16.0%
ä 4
16.0%
Ö 2
 
8.0%
í 2
 
8.0%
á 2
 
8.0%
ã 1
 
4.0%
å 1
 
4.0%
ö 1
 
4.0%
è 1
 
4.0%
Other values (2) 2
 
8.0%
Punctuation
ValueCountFrequency (%)
1
100.0%

higherGeography
Text

Missing 

Distinct: 48477
Distinct (%): 2.1%
Missing: 73521
Missing (%): 3.1%
Memory size: 18.0 MiB

Length

Max length: 177
Median length: 138
Mean length: 40.44184187
Min length: 4

Characters and Unicode

Total characters: 92528993
Distinct characters: 175
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15741
Unique (%): 0.7%

Sample

1st row: North Atlantic Ocean, Caribbean Sea, Belize
2nd row: North America, United States, Tennessee
3rd row: North America, United States, West Virginia, Randolph
4th row: United States, Georgia, Decatur County
5th row: North Atlantic Ocean, Gulf of Mexico, United States
ValueCountFrequency (%)
america 1138439
 
9.2%
north 1106434
 
8.9%
united 860922
 
6.9%
states 853131
 
6.9%
440814
 
3.5%
south 440485
 
3.5%
ocean 430252
 
3.5%
neotropics 407966
 
3.3%
atlantic 224389
 
1.8%
pacific 213984
 
1.7%
Other values (16557) 6300876
50.7%

Most occurring characters

ValueCountFrequency (%)
10129740
 
10.9%
a 8943831
 
9.7%
i 6761301
 
7.3%
e 6630309
 
7.2%
t 6497959
 
7.0%
r 5098061
 
5.5%
o 4988716
 
5.4%
, 4762006
 
5.1%
n 4606108
 
5.0%
c 3725697
 
4.0%
Other values (165) 30385265
32.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 64845487
70.1%
Uppercase Letter 12007518
 
13.0%
Space Separator 10129740
 
10.9%
Other Punctuation 4831886
 
5.2%
Dash Punctuation 597716
 
0.6%
Open Punctuation 58149
 
0.1%
Close Punctuation 58139
 
0.1%
Modifier Letter 149
 
< 0.1%
Math Symbol 90
 
< 0.1%
Decimal Number 73
 
< 0.1%
Other values (2) 46
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8943831
13.8%
i 6761301
10.4%
e 6630309
10.2%
t 6497959
10.0%
r 5098061
7.9%
o 4988716
7.7%
n 4606108
 
7.1%
c 3725697
 
5.7%
s 3264213
 
5.0%
l 2250917
 
3.5%
Other values (82) 12078375
18.6%
Uppercase Letter
ValueCountFrequency (%)
A 2049264
17.1%
N 1767234
14.7%
S 1721940
14.3%
U 927441
7.7%
C 879326
7.3%
P 645016
 
5.4%
M 559337
 
4.7%
O 533657
 
4.4%
I 403326
 
3.4%
T 335754
 
2.8%
Other values (37) 2185223
18.2%
Other Punctuation
ValueCountFrequency (%)
, 4762006
98.6%
. 43825
 
0.9%
' 17324
 
0.4%
/ 6660
 
0.1%
? 1684
 
< 0.1%
; 292
 
< 0.1%
& 39
 
< 0.1%
* 27
 
< 0.1%
: 24
 
< 0.1%
" 2
 
< 0.1%
Other values (2) 3
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
3 18
24.7%
2 18
24.7%
1 17
23.3%
0 9
12.3%
4 4
 
5.5%
8 2
 
2.7%
6 2
 
2.7%
9 2
 
2.7%
7 1
 
1.4%
Dash Punctuation
ValueCountFrequency (%)
- 597579
> 99.9%
136
 
< 0.1%
1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 47065
80.9%
( 11084
 
19.1%
Close Punctuation
ValueCountFrequency (%)
] 47054
80.9%
) 11085
 
19.1%
Modifier Letter
ValueCountFrequency (%)
ʻ 128
85.9%
ʼ 21
 
14.1%
Math Symbol
ValueCountFrequency (%)
= 87
96.7%
+ 3
 
3.3%
Modifier Symbol
ValueCountFrequency (%)
´ 34
97.1%
¸ 1
 
2.9%
Space Separator
ValueCountFrequency (%)
10129740
100.0%
Format
ValueCountFrequency (%)
11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 76853005
83.1%
Common 15675988
 
16.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8943831
 
11.6%
i 6761301
 
8.8%
e 6630309
 
8.6%
t 6497959
 
8.5%
r 5098061
 
6.6%
o 4988716
 
6.5%
n 4606108
 
6.0%
c 3725697
 
4.8%
s 3264213
 
4.2%
l 2250917
 
2.9%
Other values (129) 24085893
31.3%
Common
ValueCountFrequency (%)
10129740
64.6%
, 4762006
30.4%
- 597579
 
3.8%
[ 47065
 
0.3%
] 47054
 
0.3%
. 43825
 
0.3%
' 17324
 
0.1%
) 11085
 
0.1%
( 11084
 
0.1%
/ 6660
 
< 0.1%
Other values (26) 2566
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 92396075
99.9%
None 132576
 
0.1%
Modifier Letters 149
 
< 0.1%
Punctuation 148
 
< 0.1%
Latin Ext Additional 45
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10129740
 
11.0%
a 8943831
 
9.7%
i 6761301
 
7.3%
e 6630309
 
7.2%
t 6497959
 
7.0%
r 5098061
 
5.5%
o 4988716
 
5.4%
, 4762006
 
5.2%
n 4606108
 
5.0%
c 3725697
 
4.0%
Other values (70) 30252347
32.7%
None
ValueCountFrequency (%)
á 42738
32.2%
í 24913
18.8%
é 22834
17.2%
ó 16557
 
12.5%
ã 8603
 
6.5%
ô 3849
 
2.9%
ç 2216
 
1.7%
ñ 2003
 
1.5%
Î 1675
 
1.3%
ü 1625
 
1.2%
Other values (67) 5563
 
4.2%
Punctuation
ValueCountFrequency (%)
136
91.9%
11
 
7.4%
1
 
0.7%
Modifier Letters
ValueCountFrequency (%)
ʻ 128
85.9%
ʼ 21
 
14.1%
Latin Ext Additional
ValueCountFrequency (%)
10
22.2%
8
17.8%
5
11.1%
4
 
8.9%
ế 3
 
6.7%
3
 
6.7%
3
 
6.7%
3
 
6.7%
2
 
4.4%
1
 
2.2%
Other values (3) 3
 
6.7%

continent
Text

Missing 

Distinct: 7
Distinct (%): < 0.1%
Missing: 411637
Missing (%): 17.4%
Memory size: 18.0 MiB

Length

Max length: 13
Median length: 13
Mean length: 10.839028
Min length: 4

Characters and Unicode

Total characters: 21134327
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NORTH_AMERICA
2nd row: NORTH_AMERICA
3rd row: NORTH_AMERICA
4th row: NORTH_AMERICA
5th row: ASIA
ValueCountFrequency (%)
north_america 1041974
53.4%
south_america 357651
 
18.3%
asia 249517
 
12.8%
oceania 115158
 
5.9%
africa 104098
 
5.3%
europe 75985
 
3.9%
antarctica 5453
 
0.3%

Most occurring characters

ValueCountFrequency (%)
A 3753155
17.8%
R 2627135
12.4%
I 1873851
8.9%
E 1666753
7.9%
C 1629787
7.7%
O 1590768
7.5%
T 1410531
 
6.7%
H 1399625
 
6.6%
_ 1399625
 
6.6%
M 1399625
 
6.6%
Other values (5) 2383472
11.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 19734702
93.4%
Connector Punctuation 1399625
 
6.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 3753155
19.0%
R 2627135
13.3%
I 1873851
9.5%
E 1666753
8.4%
C 1629787
8.3%
O 1590768
8.1%
T 1410531
 
7.1%
H 1399625
 
7.1%
M 1399625
 
7.1%
N 1162585
 
5.9%
Other values (4) 1220887
 
6.2%
Connector Punctuation
ValueCountFrequency (%)
_ 1399625
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19734702
93.4%
Common 1399625
 
6.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 3753155
19.0%
R 2627135
13.3%
I 1873851
9.5%
E 1666753
8.4%
C 1629787
8.3%
O 1590768
8.1%
T 1410531
 
7.1%
H 1399625
 
7.1%
M 1399625
 
7.1%
N 1162585
 
5.9%
Other values (4) 1220887
 
6.2%
Common
ValueCountFrequency (%)
_ 1399625
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21134327
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 3753155
17.8%
R 2627135
12.4%
I 1873851
8.9%
E 1666753
7.9%
C 1629787
7.7%
O 1590768
7.5%
T 1410531
 
6.7%
H 1399625
 
6.6%
_ 1399625
 
6.6%
M 1399625
 
6.6%
Other values (5) 2383472
11.3%

waterBody
Text

Missing 

Distinct: 2466
Distinct (%): 0.6%
Missing: 1923759
Missing (%): 81.5%
Memory size: 18.0 MiB

Length

Max length: 75
Median length: 73
Mean length: 24.15323019
Min length: 6

Characters and Unicode

Total characters: 10572207
Distinct characters: 73
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1005
Unique (%): 0.2%

Sample

1st row: North Atlantic Ocean, Caribbean Sea
2nd row: North Atlantic Ocean, Gulf of Mexico
3rd row: North Atlantic Ocean, Gulf of Mexico, Galveston Bay
4th row: North Pacific Ocean, Gulf of California
5th row: North Atlantic Ocean, Gulf of Guinea
ValueCountFrequency (%)
ocean 429243
26.0%
north 326216
19.8%
atlantic 224090
13.6%
pacific 174691
10.6%
of 70763
 
4.3%
sea 70402
 
4.3%
gulf 69641
 
4.2%
south 61315
 
3.7%
mexico 54265
 
3.3%
caribbean 31788
 
1.9%
Other values (1777) 138016
 
8.4%

Most occurring characters

ValueCountFrequency (%)
1212716
11.5%
a 1102411
10.4%
c 1096281
10.4%
t 883208
 
8.4%
n 794119
 
7.5%
i 741797
 
7.0%
e 632056
 
6.0%
o 541092
 
5.1%
O 432028
 
4.1%
r 407634
 
3.9%
Other values (63) 2728865
25.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7617530
72.1%
Uppercase Letter 1581582
 
15.0%
Space Separator 1212716
 
11.5%
Other Punctuation 159679
 
1.5%
Dash Punctuation 534
 
< 0.1%
Modifier Letter 122
 
< 0.1%
Open Punctuation 22
 
< 0.1%
Close Punctuation 22
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1102411
14.5%
c 1096281
14.4%
t 883208
11.6%
n 794119
10.4%
i 741797
9.7%
e 632056
8.3%
o 541092
7.1%
r 407634
 
5.4%
h 400540
 
5.3%
f 319349
 
4.2%
Other values (23) 699043
9.2%
Uppercase Letter
ValueCountFrequency (%)
O 432028
27.3%
N 327077
20.7%
A 243543
15.4%
P 180869
11.4%
S 145435
 
9.2%
G 71397
 
4.5%
M 64603
 
4.1%
C 46516
 
2.9%
B 26617
 
1.7%
I 22994
 
1.5%
Other values (16) 20503
 
1.3%
Other Punctuation
ValueCountFrequency (%)
, 158850
99.5%
; 291
 
0.2%
' 213
 
0.1%
. 161
 
0.1%
/ 124
 
0.1%
? 21
 
< 0.1%
: 19
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 21
95.5%
[ 1
 
4.5%
Close Punctuation
ValueCountFrequency (%)
) 21
95.5%
] 1
 
4.5%
Space Separator
ValueCountFrequency (%)
1212716
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 534
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 122
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9199112
87.0%
Common 1373095
 
13.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1102411
12.0%
c 1096281
11.9%
t 883208
9.6%
n 794119
 
8.6%
i 741797
 
8.1%
e 632056
 
6.9%
o 541092
 
5.9%
O 432028
 
4.7%
r 407634
 
4.4%
h 400540
 
4.4%
Other values (49) 2167946
23.6%
Common
ValueCountFrequency (%)
1212716
88.3%
, 158850
 
11.6%
- 534
 
< 0.1%
; 291
 
< 0.1%
' 213
 
< 0.1%
. 161
 
< 0.1%
/ 124
 
< 0.1%
ʻ 122
 
< 0.1%
( 21
 
< 0.1%
) 21
 
< 0.1%
Other values (4) 42
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10571819
> 99.9%
None 266
 
< 0.1%
Modifier Letters 122
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1212716
11.5%
a 1102411
10.4%
c 1096281
10.4%
t 883208
 
8.4%
n 794119
 
7.5%
i 741797
 
7.0%
e 632056
 
6.0%
o 541092
 
5.1%
O 432028
 
4.1%
r 407634
 
3.9%
Other values (54) 2728477
25.8%
Modifier Letters
ValueCountFrequency (%)
ʻ 122
100.0%
None
ValueCountFrequency (%)
ā 122
45.9%
í 57
21.4%
á 33
 
12.4%
ñ 23
 
8.6%
ó 13
 
4.9%
é 12
 
4.5%
è 5
 
1.9%
É 1
 
0.4%

islandGroup
Text

Missing 

Distinct: 655
Distinct (%): 1.3%
Missing: 2309219
Missing (%): 97.8%
Memory size: 18.0 MiB

Length

Max length: 45
Median length: 41
Mean length: 14.63855399
Min length: 4

Characters and Unicode

Total characters: 764923
Distinct characters: 69
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 152
Unique (%): 0.3%

Sample

1st row: Pelican Cays
2nd row: Greater Antilles
3rd row: Stewart Islands
4th row: Ralik Chain
5th row: Virgin Islands
ValueCountFrequency (%)
islands 18220
 
16.0%
antilles 8706
 
7.7%
greater 8542
 
7.5%
group 7726
 
6.8%
is 5006
 
4.4%
leeward 2799
 
2.5%
new 2396
 
2.1%
hispaniola 2301
 
2.0%
chain 2114
 
1.9%
virgin 1728
 
1.5%
Other values (552) 54220
47.7%

Most occurring characters

ValueCountFrequency (%)
a 89318
 
11.7%
s 68956
 
9.0%
61504
 
8.0%
n 56352
 
7.4%
l 54957
 
7.2%
e 53661
 
7.0%
r 46030
 
6.0%
i 39007
 
5.1%
d 31951
 
4.2%
t 28112
 
3.7%
Other values (59) 235075
30.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 583346
76.3%
Uppercase Letter 112247
 
14.7%
Space Separator 61504
 
8.0%
Other Punctuation 5467
 
0.7%
Open Punctuation 1168
 
0.2%
Close Punctuation 1168
 
0.2%
Dash Punctuation 11
 
< 0.1%
Format 11
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 89318
15.3%
s 68956
11.8%
n 56352
9.7%
l 54957
9.4%
e 53661
9.2%
r 46030
7.9%
i 39007
6.7%
d 31951
 
5.5%
t 28112
 
4.8%
o 25531
 
4.4%
Other values (20) 89471
15.3%
Uppercase Letter
ValueCountFrequency (%)
I 25086
22.3%
G 19795
17.6%
A 10876
9.7%
C 8571
 
7.6%
V 5998
 
5.3%
L 5665
 
5.0%
S 5499
 
4.9%
B 4199
 
3.7%
N 3529
 
3.1%
R 3409
 
3.0%
Other values (17) 19620
17.5%
Other Punctuation
ValueCountFrequency (%)
. 5004
91.5%
' 455
 
8.3%
, 6
 
0.1%
? 2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 692
59.2%
[ 476
40.8%
Close Punctuation
ValueCountFrequency (%)
) 692
59.2%
] 476
40.8%
Space Separator
ValueCountFrequency (%)
61504
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 11
100.0%
Format
ValueCountFrequency (%)
11
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 695593
90.9%
Common 69330
 
9.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 89318
12.8%
s 68956
 
9.9%
n 56352
 
8.1%
l 54957
 
7.9%
e 53661
 
7.7%
r 46030
 
6.6%
i 39007
 
5.6%
d 31951
 
4.6%
t 28112
 
4.0%
o 25531
 
3.7%
Other values (47) 201718
29.0%
Common
ValueCountFrequency (%)
61504
88.7%
. 5004
 
7.2%
( 692
 
1.0%
) 692
 
1.0%
] 476
 
0.7%
[ 476
 
0.7%
' 455
 
0.7%
- 11
 
< 0.1%
11
 
< 0.1%
, 6
 
< 0.1%
Other values (2) 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 762646
99.7%
None 2266
 
0.3%
Punctuation 11
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 89318
 
11.7%
s 68956
 
9.0%
61504
 
8.1%
n 56352
 
7.4%
l 54957
 
7.2%
e 53661
 
7.0%
r 46030
 
6.0%
i 39007
 
5.1%
d 31951
 
4.2%
t 28112
 
3.7%
Other values (52) 232798
30.5%
None
ValueCountFrequency (%)
Î 1196
52.8%
á 1048
46.2%
Ō 16
 
0.7%
ñ 4
 
0.2%
ù 1
 
< 0.1%
à 1
 
< 0.1%
Punctuation
ValueCountFrequency (%)
11
100.0%

island
Text

Missing 

Distinct: 4075
Distinct (%): 2.6%
Missing: 2204401
Missing (%): 93.3%
Memory size: 18.0 MiB

Length

Max length: 47
Median length: 41
Mean length: 9.542050779
Min length: 3

Characters and Unicode

Total characters: 1498789
Distinct characters: 87
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1250
Unique (%): 0.8%

Sample

1st row: Honshu
2nd row: Lana'i
3rd row: Cat Cay
4th row: Hawaii
5th row: Sumatra
ValueCountFrequency (%)
island 26207
 
10.9%
hispaniola 12814
 
5.3%
cuba 6496
 
2.7%
oahu 6126
 
2.5%
atoll 5648
 
2.3%
luzon 5340
 
2.2%
new 4804
 
2.0%
bermuda 4124
 
1.7%
guinea 3811
 
1.6%
st 3730
 
1.5%
Other values (3177) 162065
67.2%

Most occurring characters

ValueCountFrequency (%)
a 230929
15.4%
n 107175
 
7.2%
i 100018
 
6.7%
o 93269
 
6.2%
84093
 
5.6%
l 83151
 
5.5%
e 75481
 
5.0%
u 73701
 
4.9%
s 67954
 
4.5%
r 59918
 
4.0%
Other values (77) 523100
34.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1155574
77.1%
Uppercase Letter 236803
 
15.8%
Space Separator 84093
 
5.6%
Other Punctuation 11708
 
0.8%
Close Punctuation 4956
 
0.3%
Open Punctuation 4953
 
0.3%
Dash Punctuation 695
 
< 0.1%
Decimal Number 6
 
< 0.1%
Modifier Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 230929
20.0%
n 107175
9.3%
i 100018
8.7%
o 93269
8.1%
l 83151
 
7.2%
e 75481
 
6.5%
u 73701
 
6.4%
s 67954
 
5.9%
r 59918
 
5.2%
d 53485
 
4.6%
Other values (33) 210493
18.2%
Uppercase Letter
ValueCountFrequency (%)
I 33564
14.2%
C 22392
 
9.5%
H 22087
 
9.3%
B 17981
 
7.6%
S 17953
 
7.6%
M 15275
 
6.5%
T 11260
 
4.8%
A 10790
 
4.6%
G 10713
 
4.5%
L 10554
 
4.5%
Other values (18) 64234
27.1%
Other Punctuation
ValueCountFrequency (%)
. 5656
48.3%
' 5655
48.3%
, 354
 
3.0%
? 34
 
0.3%
/ 9
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 2
33.3%
3 2
33.3%
2 1
16.7%
6 1
16.7%
Close Punctuation
ValueCountFrequency (%)
] 3618
73.0%
) 1338
 
27.0%
Open Punctuation
ValueCountFrequency (%)
[ 3618
73.0%
( 1335
 
27.0%
Space Separator
ValueCountFrequency (%)
84093
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 695
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1392377
92.9%
Common 106412
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 230929
16.6%
n 107175
 
7.7%
i 100018
 
7.2%
o 93269
 
6.7%
l 83151
 
6.0%
e 75481
 
5.4%
u 73701
 
5.3%
s 67954
 
4.9%
r 59918
 
4.3%
d 53485
 
3.8%
Other values (61) 447296
32.1%
Common
ValueCountFrequency (%)
84093
79.0%
. 5656
 
5.3%
' 5655
 
5.3%
] 3618
 
3.4%
[ 3618
 
3.4%
) 1338
 
1.3%
( 1335
 
1.3%
- 695
 
0.7%
, 354
 
0.3%
? 34
 
< 0.1%
Other values (6) 16
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1496976
99.9%
None 1808
 
0.1%
Latin Ext Additional 4
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 230929
15.4%
n 107175
 
7.2%
i 100018
 
6.7%
o 93269
 
6.2%
84093
 
5.6%
l 83151
 
5.6%
e 75481
 
5.0%
u 73701
 
4.9%
s 67954
 
4.5%
r 59918
 
4.0%
Other values (56) 521287
34.8%
None
ValueCountFrequency (%)
ç 458
25.3%
Î 396
21.9%
ó 249
13.8%
é 247
13.7%
á 175
 
9.7%
â 101
 
5.6%
ñ 69
 
3.8%
ã 48
 
2.7%
í 17
 
0.9%
Ö 14
 
0.8%
Other values (9) 34
 
1.9%
Latin Ext Additional
ValueCountFrequency (%)
4
100.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 1
100.0%

countryCode
Text

Missing 

Distinct 247
Distinct (%) < 0.1%
Missing 95309
Missing (%) 4.0%
Memory size 18.0 MiB
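
The summary percentages can be cross-checked with simple arithmetic. The sketch below assumes that Missing (%) is taken over all 2361473 observations and that Distinct (%) is taken over the non-missing values only; that reading is inferred from the figures in this report rather than from the profiler's documentation, and `summary_percentages` is a hypothetical helper name.

```python
N_OBSERVATIONS = 2361473  # "Number of observations" from the report overview


def summary_percentages(distinct, missing, n=N_OBSERVATIONS):
    """Recompute Missing (%) and Distinct (%) from raw counts.

    Assumption: Distinct (%) is relative to non-missing values,
    Missing (%) is relative to all observations.
    """
    non_missing = n - missing
    return {
        "missing_pct": 100 * missing / n,
        "distinct_pct": 100 * distinct / non_missing,
    }


# Counts taken from the countryCode and county summaries in this report.
country = summary_percentages(distinct=247, missing=95309)
county = summary_percentages(distinct=13641, missing=1825433)

print(round(country["missing_pct"], 1), country["distinct_pct"] < 0.1)
print(round(county["missing_pct"], 1), round(county["distinct_pct"], 1))
```

Under these assumptions the recomputed values agree with the rounded figures shown for both variables.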

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Characters and Unicode

Total characters 4532328
Distinct characters 26
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) < 0.1%

Sample

1st row BZ
2nd row US
3rd row US
4th row US
5th row US

Value                 Count     Frequency (%)
us                    845741    37.3%
mx                    117278    5.2%
br                    95213     4.2%
ph                    68631     3.0%
co                    59012     2.6%
ca                    50585     2.2%
pa                    48976     2.2%
ve                    43923     1.9%
cn                    40185     1.8%
pe                    39643     1.7%
Other values (237)    856977    37.8%

Most occurring characters

ValueCountFrequency (%)
U 922314
20.3%
S 901987
19.9%
C 277772
 
6.1%
P 257294
 
5.7%
M 212476
 
4.7%
R 196295
 
4.3%
A 194321
 
4.3%
B 172654
 
3.8%
E 158329
 
3.5%
H 122627
 
2.7%
Other values (16) 1116259
24.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4532328
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 922314
20.3%
S 901987
19.9%
C 277772
 
6.1%
P 257294
 
5.7%
M 212476
 
4.7%
R 196295
 
4.3%
A 194321
 
4.3%
B 172654
 
3.8%
E 158329
 
3.5%
H 122627
 
2.7%
Other values (16) 1116259
24.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 4532328
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 922314
20.3%
S 901987
19.9%
C 277772
 
6.1%
P 257294
 
5.7%
M 212476
 
4.7%
R 196295
 
4.3%
A 194321
 
4.3%
B 172654
 
3.8%
E 158329
 
3.5%
H 122627
 
2.7%
Other values (16) 1116259
24.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4532328
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 922314
20.3%
S 901987
19.9%
C 277772
 
6.1%
P 257294
 
5.7%
M 212476
 
4.7%
R 196295
 
4.3%
A 194321
 
4.3%
B 172654
 
3.8%
E 158329
 
3.5%
H 122627
 
2.7%
Other values (16) 1116259
24.6%

stateProvince
Text

Missing 

Distinct 7056
Distinct (%) 0.4%
Missing 637065
Missing (%) 27.0%
Memory size 18.0 MiB

Length

Max length 69
Median length 52
Mean length 9.275850611
Min length 1

Characters and Unicode

Total characters 15995351
Distinct characters 150
Distinct categories 11
Distinct scripts 2
Distinct blocks 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1731
Unique (%) 0.1%

Sample

1st row Tennessee
2nd row West Virginia
3rd row Georgia
4th row Maine
5th row Texas

Value                  Count      Frequency (%)
california             92210      4.0%
florida                79202      3.5%
virginia               63444      2.8%
carolina               49684      2.2%
new                    49642      2.2%
north                  41844      1.8%
texas                  40438      1.8%
alaska                 39419      1.7%
massachusetts          36351      1.6%
maryland               30762      1.3%
Other values (5148)    1769153    77.2%

Most occurring characters

ValueCountFrequency (%)
a 2411525
15.1%
i 1368936
 
8.6%
n 1172665
 
7.3%
o 1168862
 
7.3%
r 1039411
 
6.5%
e 843437
 
5.3%
s 754678
 
4.7%
l 682299
 
4.3%
t 604620
 
3.8%
567741
 
3.5%
Other values (140) 5381177
33.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13046442
81.6%
Uppercase Letter 2281249
 
14.3%
Space Separator 567741
 
3.5%
Dash Punctuation 43849
 
0.3%
Other Punctuation 29327
 
0.2%
Open Punctuation 13313
 
0.1%
Close Punctuation 13311
 
0.1%
Math Symbol 70
 
< 0.1%
Decimal Number 27
 
< 0.1%
Modifier Letter 21
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2411525
18.5%
i 1368936
10.5%
n 1172665
9.0%
o 1168862
9.0%
r 1039411
8.0%
e 843437
 
6.5%
s 754678
 
5.8%
l 682299
 
5.2%
t 604620
 
4.6%
u 477899
 
3.7%
Other values (72) 2522110
19.3%
Uppercase Letter
ValueCountFrequency (%)
C 328285
14.4%
M 227765
 
10.0%
N 175545
 
7.7%
S 173673
 
7.6%
A 166763
 
7.3%
P 129861
 
5.7%
T 113347
 
5.0%
V 100836
 
4.4%
F 96243
 
4.2%
B 78697
 
3.4%
Other values (33) 690234
30.3%
Other Punctuation
ValueCountFrequency (%)
. 19409
66.2%
/ 3917
 
13.4%
' 3030
 
10.3%
, 2241
 
7.6%
? 702
 
2.4%
& 27
 
0.1%
* 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
3 16
59.3%
4 3
 
11.1%
2 2
 
7.4%
8 2
 
7.4%
9 2
 
7.4%
6 1
 
3.7%
7 1
 
3.7%
Dash Punctuation
ValueCountFrequency (%)
- 43838
> 99.9%
11
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 7637
57.4%
( 5676
42.6%
Close Punctuation
ValueCountFrequency (%)
] 7636
57.4%
) 5675
42.6%
Math Symbol
ValueCountFrequency (%)
= 68
97.1%
+ 2
 
2.9%
Space Separator
ValueCountFrequency (%)
567741
100.0%
Modifier Letter
ValueCountFrequency (%)
ʼ 21
100.0%
Modifier Symbol
ValueCountFrequency (%)
¸ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15327691
95.8%
Common 667660
 
4.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2411525
15.7%
i 1368936
 
8.9%
n 1172665
 
7.7%
o 1168862
 
7.6%
r 1039411
 
6.8%
e 843437
 
5.5%
s 754678
 
4.9%
l 682299
 
4.5%
t 604620
 
3.9%
u 477899
 
3.1%
Other values (115) 4803359
31.3%
Common
ValueCountFrequency (%)
567741
85.0%
- 43838
 
6.6%
. 19409
 
2.9%
[ 7637
 
1.1%
] 7636
 
1.1%
( 5676
 
0.9%
) 5675
 
0.8%
/ 3917
 
0.6%
' 3030
 
0.5%
, 2241
 
0.3%
Other values (15) 860
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15889050
99.3%
None 106241
 
0.7%
Latin Ext Additional 28
 
< 0.1%
Modifier Letters 21
 
< 0.1%
Punctuation 11
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2411525
15.2%
i 1368936
 
8.6%
n 1172665
 
7.4%
o 1168862
 
7.4%
r 1039411
 
6.5%
e 843437
 
5.3%
s 754678
 
4.7%
l 682299
 
4.3%
t 604620
 
3.8%
567741
 
3.6%
Other values (64) 5274876
33.2%
None
ValueCountFrequency (%)
á 37610
35.4%
í 21537
20.3%
é 17350
16.3%
ó 12736
 
12.0%
ã 6488
 
6.1%
ô 3458
 
3.3%
ñ 1592
 
1.5%
ü 1325
 
1.2%
ä 729
 
0.7%
å 578
 
0.5%
Other values (53) 2838
 
2.7%
Modifier Letters
ValueCountFrequency (%)
ʼ 21
100.0%
Punctuation
ValueCountFrequency (%)
11
100.0%
Latin Ext Additional
ValueCountFrequency (%)
8
28.6%
ế 3
 
10.7%
3
 
10.7%
3
 
10.7%
3
 
10.7%
2
 
7.1%
2
 
7.1%
1
 
3.6%
1
 
3.6%
1
 
3.6%

county
Text

Missing 

Distinct 13641
Distinct (%) 2.5%
Missing 1825433
Missing (%) 77.3%
Memory size 18.0 MiB

Length

Max length 56
Median length 45
Mean length 10.24671853
Min length 1

Characters and Unicode

Total characters 5492651
Distinct characters 127
Distinct categories 11
Distinct scripts 2
Distinct blocks 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4083
Unique (%) 0.8%

Sample

1st row Randolph
2nd row Decatur County
3rd row Penobscot
4th row Galveston County
5th row Dona Ana

Value                  Count     Frequency (%)
county                 89628     10.8%
not                    33487     4.0%
stated                 33487     4.0%
san                    13117     1.6%
prince                 8967      1.1%
montgomery             8280      1.0%
district               8120      1.0%
santa                  7340      0.9%
honolulu               7298      0.9%
                       6994      0.8%
Other values (9747)    611161    73.8%

Most occurring characters

ValueCountFrequency (%)
a 521113
 
9.5%
o 447668
 
8.2%
n 426872
 
7.8%
e 416440
 
7.6%
t 377433
 
6.9%
r 296405
 
5.4%
291839
 
5.3%
i 275803
 
5.0%
u 235367
 
4.3%
l 211481
 
3.9%
Other values (117) 1992230
36.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4288557
78.1%
Uppercase Letter 816059
 
14.9%
Space Separator 291839
 
5.3%
Open Punctuation 35866
 
0.7%
Close Punctuation 35855
 
0.7%
Other Punctuation 13360
 
0.2%
Dash Punctuation 11015
 
0.2%
Decimal Number 42
 
< 0.1%
Modifier Symbol 34
 
< 0.1%
Math Symbol 19
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 521113
12.2%
o 447668
10.4%
n 426872
10.0%
e 416440
9.7%
t 377433
8.8%
r 296405
 
6.9%
i 275803
 
6.4%
u 235367
 
5.5%
l 211481
 
4.9%
s 188065
 
4.4%
Other values (50) 891910
20.8%
Uppercase Letter
ValueCountFrequency (%)
C 159354
19.5%
S 97656
12.0%
M 66313
 
8.1%
N 51844
 
6.4%
B 47582
 
5.8%
P 47574
 
5.8%
A 39007
 
4.8%
L 32820
 
4.0%
H 32647
 
4.0%
D 32163
 
3.9%
Other values (30) 209099
25.6%
Other Punctuation
ValueCountFrequency (%)
' 7093
53.1%
. 4324
32.4%
/ 1402
 
10.5%
? 289
 
2.2%
, 210
 
1.6%
* 25
 
0.2%
& 12
 
0.1%
¡ 2
 
< 0.1%
; 1
 
< 0.1%
\ 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 17
40.5%
2 15
35.7%
0 7
16.7%
4 2
 
4.8%
6 1
 
2.4%
Open Punctuation
ValueCountFrequency (%)
[ 33508
93.4%
( 2358
 
6.6%
Close Punctuation
ValueCountFrequency (%)
] 33498
93.4%
) 2357
 
6.6%
Dash Punctuation
ValueCountFrequency (%)
- 10890
98.9%
125
 
1.1%
Math Symbol
ValueCountFrequency (%)
= 18
94.7%
+ 1
 
5.3%
Space Separator
ValueCountFrequency (%)
291839
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 34
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5104616
92.9%
Common 388035
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 521113
 
10.2%
o 447668
 
8.8%
n 426872
 
8.4%
e 416440
 
8.2%
t 377433
 
7.4%
r 296405
 
5.8%
i 275803
 
5.4%
u 235367
 
4.6%
l 211481
 
4.1%
s 188065
 
3.7%
Other values (90) 1707969
33.5%
Common
ValueCountFrequency (%)
291839
75.2%
[ 33508
 
8.6%
] 33498
 
8.6%
- 10890
 
2.8%
' 7093
 
1.8%
. 4324
 
1.1%
( 2358
 
0.6%
) 2357
 
0.6%
/ 1402
 
0.4%
? 289
 
0.1%
Other values (17) 477
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5474783
99.7%
None 17734
 
0.3%
Punctuation 125
 
< 0.1%
Modifier Letters 5
 
< 0.1%
Latin Ext Additional 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 521113
 
9.5%
o 447668
 
8.2%
n 426872
 
7.8%
e 416440
 
7.6%
t 377433
 
6.9%
r 296405
 
5.4%
291839
 
5.3%
i 275803
 
5.0%
u 235367
 
4.3%
l 211481
 
3.9%
Other values (65) 1974362
36.1%
None
ValueCountFrequency (%)
á 3810
21.5%
é 3219
18.2%
í 2995
16.9%
ó 2587
14.6%
ã 1808
10.2%
ç 1124
 
6.3%
ô 367
 
2.1%
è 364
 
2.1%
ñ 315
 
1.8%
ü 299
 
1.7%
Other values (38) 846
 
4.8%
Punctuation
ValueCountFrequency (%)
125
100.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 5
100.0%
Latin Ext Additional
ValueCountFrequency (%)
3
75.0%
1
 
25.0%

municipality
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 2361472
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 6
Median length 6
Mean length 6
Min length 6

Characters and Unicode

Total characters 6
Distinct characters 4
Distinct categories 3
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row -53.33

Value    Count    Frequency (%)
53.33    1        100.0%

Most occurring characters

ValueCountFrequency (%)
3 3
50.0%
- 1
 
16.7%
5 1
 
16.7%
. 1
 
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4
66.7%
Dash Punctuation 1
 
16.7%
Other Punctuation 1
 
16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 3
75.0%
5 1
 
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 3
50.0%
- 1
 
16.7%
5 1
 
16.7%
. 1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 3
50.0%
- 1
 
16.7%
5 1
 
16.7%
. 1
 
16.7%

locality
Text

Missing 

Distinct 924366
Distinct (%) 45.7%
Missing 337166
Missing (%) 14.3%
Memory size 18.0 MiB

Length

Max length 220527
Median length 381
Mean length 40.58323318
Min length 1

Characters and Unicode

Total characters 82152923
Distinct characters 327
Distinct categories 21
Distinct scripts 4
Distinct blocks 14
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 736013
Unique (%) 36.4%

Sample

1st row Carrie Bow Cay, Spur And Groove Zone
2nd row Eastern edge of Nashville, Davidson County.
3rd row Monongahela National Forest, 1.2-1.4 mi (by road) W of Bear Heaven Campground, on road to Bickle Knob
4th row Hales Landing, Flint River about 7 miles below Bainbridge, basal Chattahoochee Formation, Oligocene, Vicksburgian
5th row Orono

Value                    Count       Frequency (%)
of                       676638      5.1%
de                       173473      1.3%
island                   171486      1.3%
km                       144885      1.1%
on                       127233      1.0%
near                     121493      0.9%
the                      114523      0.9%
road                     113788      0.9%
mi                       107866      0.8%
and                      105679      0.8%
Other values (335687)    11406497    86.0%

Most occurring characters

ValueCountFrequency (%)
11193478
 
13.6%
a 7469383
 
9.1%
e 5570223
 
6.8%
o 5407301
 
6.6%
n 4552644
 
5.5%
i 4136655
 
5.0%
r 3972806
 
4.8%
t 3638627
 
4.4%
l 2985176
 
3.6%
s 2875069
 
3.5%
Other values (317) 30351561
36.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 55766334
67.9%
Space Separator 11193478
 
13.6%
Uppercase Letter 9422464
 
11.5%
Other Punctuation 3689608
 
4.5%
Decimal Number 1341617
 
1.6%
Open Punctuation 187756
 
0.2%
Close Punctuation 187019
 
0.2%
Dash Punctuation 184291
 
0.2%
Control 148080
 
0.2%
Math Symbol 15175
 
< 0.1%
Other values (11) 17101
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 7469383
13.4%
e 5570223
10.0%
o 5407301
9.7%
n 4552644
 
8.2%
i 4136655
 
7.4%
r 3972806
 
7.1%
t 3638627
 
6.5%
l 2985176
 
5.4%
s 2875069
 
5.2%
u 2055702
 
3.7%
Other values (132) 13102748
23.5%
Uppercase Letter
ValueCountFrequency (%)
S 961889
 
10.2%
C 957641
 
10.2%
M 661223
 
7.0%
P 637053
 
6.8%
R 626599
 
6.7%
B 575723
 
6.1%
N 541498
 
5.7%
A 453780
 
4.8%
I 410717
 
4.4%
L 409160
 
4.3%
Other values (70) 3187181
33.8%
Other Punctuation
ValueCountFrequency (%)
, 1706617
46.3%
. 1642092
44.5%
: 131554
 
3.6%
; 81659
 
2.2%
' 58445
 
1.6%
" 28934
 
0.8%
/ 20650
 
0.6%
& 11584
 
0.3%
# 3572
 
0.1%
? 3483
 
0.1%
Other values (9) 1018
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 262130
19.5%
2 192764
14.4%
0 182124
13.6%
5 152353
11.4%
3 133850
10.0%
4 110344
8.2%
6 96303
 
7.2%
7 74994
 
5.6%
8 73246
 
5.5%
9 63509
 
4.7%
Control
ValueCountFrequency (%)
147366
99.5%
665
 
0.4%
 16
 
< 0.1%
 11
 
< 0.1%
 9
 
< 0.1%
 8
 
< 0.1%
 2
 
< 0.1%
 2
 
< 0.1%
 1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 10268
67.7%
+ 2352
 
15.5%
± 1347
 
8.9%
~ 419
 
2.8%
> 417
 
2.7%
< 329
 
2.2%
| 35
 
0.2%
5
 
< 0.1%
3
 
< 0.1%
Other Number
ValueCountFrequency (%)
½ 3259
66.8%
¼ 1386
28.4%
¾ 192
 
3.9%
² 22
 
0.5%
18
 
0.4%
2
 
< 0.1%
³ 2
 
< 0.1%
1
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
° 2020
99.3%
6
 
0.3%
3
 
0.1%
3
 
0.1%
1
 
< 0.1%
© 1
 
< 0.1%
Format
ValueCountFrequency (%)
­ 28
80.0%
2
 
5.7%
2
 
5.7%
1
 
2.9%
1
 
2.9%
1
 
2.9%
Open Punctuation
ValueCountFrequency (%)
( 141352
75.3%
[ 46198
 
24.6%
99
 
0.1%
{ 54
 
< 0.1%
53
 
< 0.1%
Modifier Letter
ValueCountFrequency (%)
ʻ 122
96.1%
2
 
1.6%
1
 
0.8%
1
 
0.8%
1
 
0.8%
Currency Symbol
ValueCountFrequency (%)
¢ 39
61.9%
¤ 17
27.0%
£ 3
 
4.8%
$ 3
 
4.8%
¥ 1
 
1.6%
Nonspacing Mark
ValueCountFrequency (%)
̄ 2
33.3%
̈ 2
33.3%
1
16.7%
̌ 1
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 184274
> 99.9%
11
 
< 0.1%
6
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 140703
75.2%
] 46254
 
24.7%
} 62
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
» 205
90.7%
15
 
6.6%
6
 
2.7%
Initial Punctuation
ValueCountFrequency (%)
« 201
84.1%
37
 
15.5%
1
 
0.4%
Modifier Symbol
ValueCountFrequency (%)
´ 136
90.7%
¨ 9
 
6.0%
^ 5
 
3.3%
Other Letter
ValueCountFrequency (%)
º 859
98.1%
ª 17
 
1.9%
Space Separator
ValueCountFrequency (%)
11193478
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 8463
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 65189652
79.4%
Common 16963237
 
20.6%
Greek 27
 
< 0.1%
Inherited 7
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 7469383
 
11.5%
e 5570223
 
8.5%
o 5407301
 
8.3%
n 4552644
 
7.0%
i 4136655
 
6.3%
r 3972806
 
6.1%
t 3638627
 
5.6%
l 2985176
 
4.6%
s 2875069
 
4.4%
u 2055702
 
3.2%
Other values (208) 22526066
34.6%
Common
ValueCountFrequency (%)
11193478
66.0%
, 1706617
 
10.1%
. 1642092
 
9.7%
1 262130
 
1.5%
2 192764
 
1.1%
- 184274
 
1.1%
0 182124
 
1.1%
5 152353
 
0.9%
147366
 
0.9%
( 141352
 
0.8%
Other values (84) 1158687
 
6.8%
Greek
ValueCountFrequency (%)
λ 6
22.2%
ν 5
18.5%
Κ 3
11.1%
υ 3
11.1%
ή 3
11.1%
η 3
11.1%
ω 1
 
3.7%
ρ 1
 
3.7%
Π 1
 
3.7%
ά 1
 
3.7%
Inherited
ValueCountFrequency (%)
̄ 2
28.6%
̈ 2
28.6%
1
14.3%
1
14.3%
̌ 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 81912693
99.7%
None 239779
 
0.3%
Punctuation 266
 
< 0.1%
Modifier Letters 122
 
< 0.1%
Number Forms 21
 
< 0.1%
Box Drawing 9
 
< 0.1%
Latin Ext Additional 8
 
< 0.1%
Arrows 5
 
< 0.1%
Diacriticals 5
 
< 0.1%
Phonetic Ext 5
 
< 0.1%
Other values (4) 10
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
11193478
 
13.7%
a 7469383
 
9.1%
e 5570223
 
6.8%
o 5407301
 
6.6%
n 4552644
 
5.6%
i 4136655
 
5.1%
r 3972806
 
4.9%
t 3638627
 
4.4%
l 2985176
 
3.6%
s 2875069
 
3.5%
Other values (88) 30111331
36.8%
None
ValueCountFrequency (%)
í 59545
24.8%
á 42977
17.9%
é 29020
12.1%
ó 24065
10.0%
ñ 12011
 
5.0%
ã 9499
 
4.0%
ú 6798
 
2.8%
ç 5940
 
2.5%
ü 4680
 
2.0%
ä 4359
 
1.8%
Other values (178) 40885
17.1%
Modifier Letters
ValueCountFrequency (%)
ʻ 122
100.0%
Punctuation
ValueCountFrequency (%)
99
37.2%
53
19.9%
37
 
13.9%
30
 
11.3%
15
 
5.6%
11
 
4.1%
6
 
2.3%
6
 
2.3%
2
 
0.8%
2
 
0.8%
Other values (5) 5
 
1.9%
Number Forms
ValueCountFrequency (%)
18
85.7%
2
 
9.5%
1
 
4.8%
Box Drawing
ValueCountFrequency (%)
6
66.7%
3
33.3%
Arrows
ValueCountFrequency (%)
5
100.0%
Block Elements
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
Math Operators
ValueCountFrequency (%)
3
100.0%
IPA Ext
ValueCountFrequency (%)
ɶ 2
100.0%
Diacriticals
ValueCountFrequency (%)
̄ 2
40.0%
̈ 2
40.0%
̌ 1
20.0%
Phonetic Ext
ValueCountFrequency (%)
2
40.0%
1
20.0%
1
20.0%
1
20.0%
Latin Ext Additional
ValueCountFrequency (%)
ḿ 2
25.0%
1
12.5%
1
12.5%
ế 1
12.5%
1
12.5%
1
12.5%
1
12.5%
Diacriticals Sup
ValueCountFrequency (%)
1
100.0%

verbatimElevation
Text

Missing 

Distinct 2886
Distinct (%) 4.2%
Missing 2293088
Missing (%) 97.1%
Memory size 18.0 MiB

Length

Max length 152
Median length 124
Mean length 7.501996052
Min length 1

Characters and Unicode

Total characters 513024
Distinct characters 77
Distinct categories 9
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 752
Unique (%) 1.1%

Sample

1st row 3600 (3440-3760) ft
2nd row ~1800 ft.
3rd row 80 ft
4th row 160 m
5th row 150 m

Value                  Count    Frequency (%)
ft                     49451    34.2%
m                      16017    11.1%
ca                     3567     2.5%
feet                   1112     0.8%
200                    1103     0.8%
1100-1350              1002     0.7%
10                     898      0.6%
20                     771      0.5%
3400                   723      0.5%
3500                   707      0.5%
Other values (1929)    69115    47.8%

Most occurring characters

ValueCountFrequency (%)
0 102042
19.9%
76081
14.8%
t 52717
10.3%
f 51306
10.0%
1 26577
 
5.2%
3 25497
 
5.0%
2 24429
 
4.8%
4 22152
 
4.3%
5 20591
 
4.0%
m 17094
 
3.3%
Other values (67) 94538
18.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 266424
51.9%
Lowercase Letter 154716
30.2%
Space Separator 76081
 
14.8%
Dash Punctuation 8084
 
1.6%
Other Punctuation 5435
 
1.1%
Uppercase Letter 1378
 
0.3%
Open Punctuation 398
 
0.1%
Close Punctuation 398
 
0.1%
Math Symbol 110
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 52717
34.1%
f 51306
33.2%
m 17094
 
11.0%
e 7739
 
5.0%
a 6180
 
4.0%
c 4297
 
2.8%
s 2575
 
1.7%
l 2322
 
1.5%
o 2060
 
1.3%
r 1687
 
1.1%
Other values (15) 6739
 
4.4%
Uppercase Letter
ValueCountFrequency (%)
D 402
29.2%
T 181
13.1%
P 145
 
10.5%
W 140
 
10.2%
A 108
 
7.8%
R 107
 
7.8%
C 61
 
4.4%
N 40
 
2.9%
G 32
 
2.3%
S 21
 
1.5%
Other values (12) 141
 
10.2%
Decimal Number
ValueCountFrequency (%)
0 102042
38.3%
1 26577
 
10.0%
3 25497
 
9.6%
2 24429
 
9.2%
4 22152
 
8.3%
5 20591
 
7.7%
6 15483
 
5.8%
8 12050
 
4.5%
7 10078
 
3.8%
9 7525
 
2.8%
Other Punctuation
ValueCountFrequency (%)
. 4628
85.2%
: 402
 
7.4%
' 167
 
3.1%
, 153
 
2.8%
" 32
 
0.6%
? 28
 
0.5%
; 18
 
0.3%
/ 4
 
0.1%
& 3
 
0.1%
Math Symbol
ValueCountFrequency (%)
< 54
49.1%
+ 19
 
17.3%
= 18
 
16.4%
> 10
 
9.1%
~ 9
 
8.2%
Open Punctuation
ValueCountFrequency (%)
( 361
90.7%
[ 37
 
9.3%
Close Punctuation
ValueCountFrequency (%)
) 361
90.7%
] 37
 
9.3%
Space Separator
ValueCountFrequency (%)
76081
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8084
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 356930
69.6%
Latin 156094
30.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 52717
33.8%
f 51306
32.9%
m 17094
 
11.0%
e 7739
 
5.0%
a 6180
 
4.0%
c 4297
 
2.8%
s 2575
 
1.6%
l 2322
 
1.5%
o 2060
 
1.3%
r 1687
 
1.1%
Other values (37) 8117
 
5.2%
Common
ValueCountFrequency (%)
0 102042
28.6%
76081
21.3%
1 26577
 
7.4%
3 25497
 
7.1%
2 24429
 
6.8%
4 22152
 
6.2%
5 20591
 
5.8%
6 15483
 
4.3%
8 12050
 
3.4%
7 10078
 
2.8%
Other values (20) 21950
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 513024
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 102042
19.9%
76081
14.8%
t 52717
10.3%
f 51306
10.0%
1 26577
 
5.2%
3 25497
 
5.0%
2 24429
 
4.8%
4 22152
 
4.3%
5 20591
 
4.0%
m 17094
 
3.3%
Other values (67) 94538
18.4%

verbatimDepth
Text

Missing 

Distinct 853
Distinct (%) 5.9%
Missing 2347005
Missing (%) 99.4%
Memory size 18.0 MiB

Length

Max length 232132
Median length 91
Mean length 24.7726016
Min length 1

Characters and Unicode

Total characters 358410
Distinct characters 103
Distinct categories 11
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 441
Unique (%) 3.0%

Sample

1st row Littoral
2nd row 00000000, 00000013
3rd row penetration depth: 15cm
4th row 1 ms ca.
5th row Intertidal

Value                  Count    Frequency (%)
ca                     6547     15.4%
intertidal             3133     7.4%
surface                1656     3.9%
recorded               744      1.7%
depths                 742      1.7%
multiple               737      1.7%
false                  567      1.3%
depth                  504      1.2%
us                     503      1.2%
1                      349      0.8%
Other values (5512)    27047    63.6%

Most occurring characters

ValueCountFrequency (%)
41247
 
11.5%
a 23342
 
6.5%
e 18168
 
5.1%
t 16885
 
4.7%
15290
 
4.3%
r 12513
 
3.5%
c 12484
 
3.5%
i 12429
 
3.5%
0 11460
 
3.2%
l 11180
 
3.1%
Other values (93) 183412
51.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 170824
47.7%
Decimal Number 57424
 
16.0%
Uppercase Letter 47386
 
13.2%
Control 41433
 
11.6%
Other Punctuation 17500
 
4.9%
Space Separator 15290
 
4.3%
Dash Punctuation 5657
 
1.6%
Connector Punctuation 2318
 
0.6%
Open Punctuation 257
 
0.1%
Close Punctuation 254
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 23342
13.7%
e 18168
10.6%
t 16885
9.9%
r 12513
 
7.3%
c 12484
 
7.3%
i 12429
 
7.3%
l 11180
 
6.5%
d 9861
 
5.8%
n 9804
 
5.7%
o 8826
 
5.2%
Other values (31) 35332
20.7%
Uppercase Letter
ValueCountFrequency (%)
I 5324
11.2%
S 5076
10.7%
E 4373
9.2%
C 4137
 
8.7%
A 3839
 
8.1%
N 3365
 
7.1%
M 3220
 
6.8%
T 2786
 
5.9%
R 2663
 
5.6%
U 1879
 
4.0%
Other values (17) 10724
22.6%
Other Punctuation
ValueCountFrequency (%)
. 8114
46.4%
: 4020
23.0%
, 3576
20.4%
/ 997
 
5.7%
; 277
 
1.6%
" 233
 
1.3%
' 179
 
1.0%
& 85
 
0.5%
@ 10
 
0.1%
? 5
 
< 0.1%
Other values (2) 4
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 11460
20.0%
1 8175
14.2%
2 7952
13.8%
3 5184
9.0%
4 4941
8.6%
5 4328
 
7.5%
8 4216
 
7.3%
6 3855
 
6.7%
7 3699
 
6.4%
9 3614
 
6.3%
Math Symbol
ValueCountFrequency (%)
< 34
50.7%
= 17
25.4%
+ 10
 
14.9%
~ 6
 
9.0%
Control
ValueCountFrequency (%)
41247
99.6%
186
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 252
98.1%
[ 5
 
1.9%
Close Punctuation
ValueCountFrequency (%)
) 249
98.0%
] 5
 
2.0%
Space Separator
ValueCountFrequency (%)
15290
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5657
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2318
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 218210
60.9%
Common 140200
39.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 23342
 
10.7%
e 18168
 
8.3%
t 16885
 
7.7%
r 12513
 
5.7%
c 12484
 
5.7%
i 12429
 
5.7%
l 11180
 
5.1%
d 9861
 
4.5%
n 9804
 
4.5%
o 8826
 
4.0%
Other values (58) 82718
37.9%
Common
ValueCountFrequency (%)
41247
29.4%
15290
 
10.9%
0 11460
 
8.2%
1 8175
 
5.8%
. 8114
 
5.8%
2 7952
 
5.7%
- 5657
 
4.0%
3 5184
 
3.7%
4 4941
 
3.5%
5 4328
 
3.1%
Other values (25) 27852
19.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 358321
> 99.9%
None 89
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
41247
 
11.5%
a 23342
 
6.5%
e 18168
 
5.1%
t 16885
 
4.7%
15290
 
4.3%
r 12513
 
3.5%
c 12484
 
3.5%
i 12429
 
3.5%
0 11460
 
3.2%
l 11180
 
3.1%
Other values (77) 183323
51.2%
None
ValueCountFrequency (%)
é 16
18.0%
í 14
15.7%
ó 10
11.2%
á 10
11.2%
ü 8
9.0%
ô 6
 
6.7%
ö 5
 
5.6%
ã 4
 
4.5%
ñ 3
 
3.4%
ä 3
 
3.4%
Other values (6) 10
11.2%

decimalLatitude
Text

Missing 

Distinct 97854
Distinct (%) 13.7%
Missing 1649765
Missing (%) 69.9%
Memory size 18.0 MiB

Length

Max length 84
Median length 11
Mean length 6.194806016
Min length 3

Characters and Unicode

Total characters 4408893
Distinct characters 37
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 44295
Unique (%) 6.2%

Sample

1st row 16.8033
2nd row 38.9361
3rd row 29.2483
4th row 44.8831
5th row 29.2586

Value                   Count     Frequency (%)
25.58                   2629      0.4%
40.6583                 2215      0.3%
26.17                   1853      0.3%
26.5                    1352      0.2%
39.6891                 1261      0.2%
38.9694                 1127      0.2%
39.6306                 1069      0.2%
26.97                   1018      0.1%
38.895                  1015      0.1%
60.75                   991       0.1%
Other values (91110)    697189    98.0%

Most occurring characters

ValueCountFrequency (%)
. 711708
16.1%
3 572272
13.0%
2 393297
8.9%
1 382548
8.7%
5 353375
8.0%
8 347442
7.9%
7 340351
7.7%
4 332050
7.5%
6 327777
7.4%
9 278204
 
6.3%
Other values (27) 369869
8.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3558864
80.7%
Other Punctuation 711712
 
16.1%
Dash Punctuation 138221
 
3.1%
Lowercase Letter 57
 
< 0.1%
Uppercase Letter 28
 
< 0.1%
Space Separator 11
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 8
14.0%
s 7
12.3%
a 6
10.5%
t 5
8.8%
n 5
8.8%
i 5
8.8%
d 4
7.0%
r 4
7.0%
u 3
 
5.3%
l 2
 
3.5%
Other values (6) 8
14.0%
Decimal Number
ValueCountFrequency (%)
3 572272
16.1%
2 393297
11.1%
1 382548
10.7%
5 353375
9.9%
8 347442
9.8%
7 340351
9.6%
4 332050
9.3%
6 327777
9.2%
9 278204
7.8%
0 231548
6.5%
Uppercase Letter
ValueCountFrequency (%)
E 18
64.3%
A 3
 
10.7%
L 2
 
7.1%
I 2
 
7.1%
B 1
 
3.6%
N 1
 
3.6%
W 1
 
3.6%
Other Punctuation
ValueCountFrequency (%)
. 711708
> 99.9%
, 4
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 138221
100.0%
Space Separator
ValueCountFrequency (%)
11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4408808
> 99.9%
Latin 85
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 18
21.2%
e 8
 
9.4%
s 7
 
8.2%
a 6
 
7.1%
t 5
 
5.9%
n 5
 
5.9%
i 5
 
5.9%
d 4
 
4.7%
r 4
 
4.7%
A 3
 
3.5%
Other values (13) 20
23.5%
Common
ValueCountFrequency (%)
. 711708
16.1%
3 572272
13.0%
2 393297
8.9%
1 382548
8.7%
5 353375
8.0%
8 347442
7.9%
7 340351
7.7%
4 332050
7.5%
6 327777
7.4%
9 278204
 
6.3%
Other values (4) 369784
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4408893
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 711708
16.1%
3 572272
13.0%
2 393297
8.9%
1 382548
8.7%
5 353375
8.0%
8 347442
7.9%
7 340351
7.7%
4 332050
7.5%
6 327777
7.4%
9 278204
 
6.3%
Other values (27) 369869
8.4%

decimalLongitude
Text

Missing 

Distinct 102754
Distinct (%) 14.4%
Missing 1649765
Missing (%) 69.9%
Memory size 18.0 MiB

Length

Max length 13
Median length 12
Mean length 7.06795343
Min length 3

Characters and Unicode

Total characters 5030319
Distinct characters 23
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 44520
Unique (%) 6.3%

Sample

1st row -88.0767
2nd row -79.6908
3rd row -88.1214
4th row -68.672
5th row -94.9533

Value                   Count     Frequency (%)
80.1                    2655      0.4%
105.644                 1281      0.2%
127.848                 1115      0.2%
88.08                   1095      0.2%
77.4714                 1069      0.2%
67.7683                 1046      0.1%
139.5                   995       0.1%
77.0367                 986       0.1%
80.13                   980       0.1%
77.1767                 933       0.1%
Other values (95547)    699553    98.3%

Most occurring characters

ValueCountFrequency (%)
. 711707
14.1%
- 580730
11.5%
7 529418
10.5%
1 483727
9.6%
8 448602
8.9%
6 391174
7.8%
3 376072
7.5%
5 343881
6.8%
2 329551
6.6%
9 300995
6.0%
Other values (13) 534462
10.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3737869
74.3%
Other Punctuation 711707
 
14.1%
Dash Punctuation 580730
 
11.5%
Uppercase Letter 12
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 529418
14.2%
1 483727
12.9%
8 448602
12.0%
6 391174
10.5%
3 376072
10.1%
5 343881
9.2%
2 329551
8.8%
9 300995
8.1%
4 273631
7.3%
0 260818
7.0%
Uppercase Letter
ValueCountFrequency (%)
R 2
16.7%
A 2
16.7%
N 1
8.3%
O 1
8.3%
T 1
8.3%
H 1
8.3%
M 1
8.3%
E 1
8.3%
I 1
8.3%
C 1
8.3%
Other Punctuation
ValueCountFrequency (%)
. 711707
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 580730
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5030307
> 99.9%
Latin 12
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
. 711707
14.1%
- 580730
11.5%
7 529418
10.5%
1 483727
9.6%
8 448602
8.9%
6 391174
7.8%
3 376072
7.5%
5 343881
6.8%
2 329551
6.6%
9 300995
6.0%
Other values (3) 534450
10.6%
Latin
ValueCountFrequency (%)
R 2
16.7%
A 2
16.7%
N 1
8.3%
O 1
8.3%
T 1
8.3%
H 1
8.3%
M 1
8.3%
E 1
8.3%
I 1
8.3%
C 1
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5030319
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 711707
14.1%
- 580730
11.5%
7 529418
10.5%
1 483727
9.6%
8 448602
8.9%
6 391174
7.8%
3 376072
7.5%
5 343881
6.8%
2 329551
6.6%
9 300995
6.0%
Other values (13) 534462
10.6%
Distinct: 5438
Distinct (%): 12.6%
Missing: 2318351
Missing (%): 98.2%
Memory size: 18.0 MiB

Length

Max length: 9
Median length: 8
Mean length: 6.314085618
Min length: 3

Characters and Unicode

Total characters: 272276
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2012
Unique (%): 4.7%

Sample

1st row: 401.57
2nd row: 3246.0
3rd row: 3429.51
4th row: 801.57
5th row: 4233.0
ValueCountFrequency (%)
3036.0 447
 
1.0%
100.0 377
 
0.9%
347.62 374
 
0.9%
500.0 363
 
0.8%
16000.0 330
 
0.8%
186.68 323
 
0.7%
1000.0 321
 
0.7%
4615.0 287
 
0.7%
1066.0 266
 
0.6%
5615.0 259
 
0.6%
Other values (5428) 39775
92.2%

Most occurring characters

ValueCountFrequency (%)
. 43122
15.8%
0 42716
15.7%
1 31279
11.5%
2 23189
8.5%
5 21977
8.1%
3 21908
8.0%
4 20527
7.5%
6 19080
7.0%
9 16598
 
6.1%
8 15972
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 229154
84.2%
Other Punctuation 43122
 
15.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 42716
18.6%
1 31279
13.6%
2 23189
10.1%
5 21977
9.6%
3 21908
9.6%
4 20527
9.0%
6 19080
8.3%
9 16598
 
7.2%
8 15972
 
7.0%
7 15908
 
6.9%
Other Punctuation
ValueCountFrequency (%)
. 43122
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 272276
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 43122
15.8%
0 42716
15.7%
1 31279
11.5%
2 23189
8.5%
5 21977
8.1%
3 21908
8.0%
4 20527
7.5%
6 19080
7.0%
9 16598
 
6.1%
8 15972
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 272276
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 43122
15.8%
0 42716
15.7%
1 31279
11.5%
2 23189
8.5%
5 21977
8.1%
3 21908
8.0%
4 20527
7.5%
6 19080
7.0%
9 16598
 
6.1%
8 15972
 
5.9%

coordinatePrecision
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 2361472
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 11
Distinct characters: 10
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Leeward Is.
ValueCountFrequency (%)
leeward 1
50.0%
is 1
50.0%

Most occurring characters

ValueCountFrequency (%)
e 2
18.2%
L 1
9.1%
w 1
9.1%
a 1
9.1%
r 1
9.1%
d 1
9.1%
1
9.1%
I 1
9.1%
s 1
9.1%
. 1
9.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
63.6%
Uppercase Letter 2
 
18.2%
Space Separator 1
 
9.1%
Other Punctuation 1
 
9.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2
28.6%
w 1
14.3%
a 1
14.3%
r 1
14.3%
d 1
14.3%
s 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
L 1
50.0%
I 1
50.0%
Space Separator
ValueCountFrequency (%)
1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9
81.8%
Common 2
 
18.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2
22.2%
L 1
11.1%
w 1
11.1%
a 1
11.1%
r 1
11.1%
d 1
11.1%
I 1
11.1%
s 1
11.1%
Common
ValueCountFrequency (%)
1
50.0%
. 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2
18.2%
L 1
9.1%
w 1
9.1%
a 1
9.1%
r 1
9.1%
d 1
9.1%
1
9.1%
I 1
9.1%
s 1
9.1%
. 1
9.1%

pointRadiusSpatialFit
Text

Missing 

Distinct: 3
Distinct (%): 100.0%
Missing: 2361470
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 21
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: 3190721
2nd row: Antigua
3rd row: 3869031
ValueCountFrequency (%)
3190721 1
33.3%
antigua 1
33.3%
3869031 1
33.3%

Most occurring characters

ValueCountFrequency (%)
3 3
14.3%
1 3
14.3%
9 2
 
9.5%
0 2
 
9.5%
7 1
 
4.8%
2 1
 
4.8%
A 1
 
4.8%
n 1
 
4.8%
t 1
 
4.8%
i 1
 
4.8%
Other values (5) 5
23.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14
66.7%
Lowercase Letter 6
28.6%
Uppercase Letter 1
 
4.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 3
21.4%
1 3
21.4%
9 2
14.3%
0 2
14.3%
7 1
 
7.1%
2 1
 
7.1%
8 1
 
7.1%
6 1
 
7.1%
Lowercase Letter
ValueCountFrequency (%)
n 1
16.7%
t 1
16.7%
i 1
16.7%
g 1
16.7%
u 1
16.7%
a 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 14
66.7%
Latin 7
33.3%

Most frequent character per script

Common
ValueCountFrequency (%)
3 3
21.4%
1 3
21.4%
9 2
14.3%
0 2
14.3%
7 1
 
7.1%
2 1
 
7.1%
8 1
 
7.1%
6 1
 
7.1%
Latin
ValueCountFrequency (%)
A 1
14.3%
n 1
14.3%
t 1
14.3%
i 1
14.3%
g 1
14.3%
u 1
14.3%
a 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 3
14.3%
1 3
14.3%
9 2
 
9.5%
0 2
 
9.5%
7 1
 
4.8%
2 1
 
4.8%
A 1
 
4.8%
n 1
 
4.8%
t 1
 
4.8%
i 1
 
4.8%
Other values (5) 5
23.8%
Distinct: 9
Distinct (%): < 0.1%
Missing: 2103318
Missing (%): 89.1%
Memory size: 18.0 MiB

Length

Max length: 23
Median length: 23
Mean length: 22.71790204
Min length: 2

Characters and Unicode

Total characters: 5864740
Distinct characters: 29
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Degrees Minutes Seconds
2nd row: Degrees Minutes Seconds
3rd row: Degrees Minutes Seconds
4th row: Degrees Minutes Seconds
5th row: Degrees Minutes Seconds
ValueCountFrequency (%)
degrees 255841
33.4%
minutes 249742
32.6%
seconds 249742
32.6%
decimal 6099
 
0.8%
township 1828
 
0.2%
range 1828
 
0.2%
utm 195
 
< 0.1%
marsden 143
 
< 0.1%
square 143
 
< 0.1%
unknown 140
 
< 0.1%
Other values (2) 8
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 1275220
21.7%
s 757296
12.9%
507554
 
8.7%
n 503703
 
8.6%
g 257669
 
4.4%
i 257669
 
4.4%
r 256127
 
4.4%
d 255978
 
4.4%
D 255854
 
4.4%
c 255841
 
4.4%
Other values (19) 1281829
21.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4597158
78.4%
Uppercase Letter 760028
 
13.0%
Space Separator 507554
 
8.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1275220
27.7%
s 757296
16.5%
n 503703
 
11.0%
g 257669
 
5.6%
i 257669
 
5.6%
r 256127
 
5.6%
d 255978
 
5.6%
c 255841
 
5.6%
o 251710
 
5.5%
u 249885
 
5.4%
Other values (9) 276060
 
6.0%
Uppercase Letter
ValueCountFrequency (%)
D 255854
33.7%
M 250080
32.9%
S 249885
32.9%
T 2023
 
0.3%
R 1828
 
0.2%
U 342
 
< 0.1%
A 8
 
< 0.1%
Q 7
 
< 0.1%
G 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
507554
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5357186
91.3%
Common 507554
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1275220
23.8%
s 757296
14.1%
n 503703
 
9.4%
g 257669
 
4.8%
i 257669
 
4.8%
r 256127
 
4.8%
d 255978
 
4.8%
D 255854
 
4.8%
c 255841
 
4.8%
o 251710
 
4.7%
Other values (18) 1030119
19.2%
Common
ValueCountFrequency (%)
507554
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5864740
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1275220
21.7%
s 757296
12.9%
507554
 
8.7%
n 503703
 
8.6%
g 257669
 
4.4%
i 257669
 
4.4%
r 256127
 
4.4%
d 255978
 
4.4%
D 255854
 
4.4%
c 255841
 
4.4%
Other values (19) 1281829
21.9%

verbatimSRS
Text

Missing 

Distinct: 6
Distinct (%): 100.0%
Missing: 2361467
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 10
Median length: 7
Mean length: 7
Min length: 4

Characters and Unicode

Total characters: 42
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): 100.0%

Sample

1st row: 1961-01-09
2nd row: 2003-06-02
3rd row: 1955
4th row: 1911-08-27
5th row: 1907
ValueCountFrequency (%)
1961-01-09 1
16.7%
2003-06-02 1
16.7%
1955 1
16.7%
1911-08-27 1
16.7%
1907 1
16.7%
1876 1
16.7%

Most occurring characters

ValueCountFrequency (%)
1 9
21.4%
0 8
19.0%
- 6
14.3%
9 5
11.9%
6 3
 
7.1%
2 3
 
7.1%
7 3
 
7.1%
5 2
 
4.8%
8 2
 
4.8%
3 1
 
2.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 36
85.7%
Dash Punctuation 6
 
14.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 9
25.0%
0 8
22.2%
9 5
13.9%
6 3
 
8.3%
2 3
 
8.3%
7 3
 
8.3%
5 2
 
5.6%
8 2
 
5.6%
3 1
 
2.8%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 42
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 9
21.4%
0 8
19.0%
- 6
14.3%
9 5
11.9%
6 3
 
7.1%
2 3
 
7.1%
7 3
 
7.1%
5 2
 
4.8%
8 2
 
4.8%
3 1
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 9
21.4%
0 8
19.0%
- 6
14.3%
9 5
11.9%
6 3
 
7.1%
2 3
 
7.1%
7 3
 
7.1%
5 2
 
4.8%
8 2
 
4.8%
3 1
 
2.4%

footprintSRS
Text

Missing 

Distinct: 3
Distinct (%): 100.0%
Missing: 2361470
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.333333333
Min length: 1

Characters and Unicode

Total characters: 7
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: 9
2nd row: 153
3rd row: 239
ValueCountFrequency (%)
9 1
33.3%
153 1
33.3%
239 1
33.3%

Most occurring characters

ValueCountFrequency (%)
9 2
28.6%
3 2
28.6%
1 1
14.3%
5 1
14.3%
2 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 2
28.6%
3 2
28.6%
1 1
14.3%
5 1
14.3%
2 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Common 7
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 2
28.6%
3 2
28.6%
1 1
14.3%
5 1
14.3%
2 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 2
28.6%
3 2
28.6%
1 1
14.3%
5 1
14.3%
2 1
14.3%

footprintSpatialFit
Text

Missing 

Distinct: 4
Distinct (%): 100.0%
Missing: 2361469
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 71
Median length: 37
Mean length: 19.5
Min length: 1

Characters and Unicode

Total characters: 78
Distinct characters: 31
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 100.0%

Sample

1st row: 9
2nd row: 153
3rd row: 239
4th row: John's Hope, Willings Forest Reserve, trail up from dam through forest.
ValueCountFrequency (%)
forest 2
14.3%
9 1
 
7.1%
153 1
 
7.1%
239 1
 
7.1%
john's 1
 
7.1%
hope 1
 
7.1%
willings 1
 
7.1%
reserve 1
 
7.1%
trail 1
 
7.1%
up 1
 
7.1%
Other values (3) 3
21.4%

Most occurring characters

ValueCountFrequency (%)
10
 
12.8%
o 6
 
7.7%
r 6
 
7.7%
e 6
 
7.7%
s 5
 
6.4%
t 4
 
5.1%
i 3
 
3.8%
h 3
 
3.8%
l 3
 
3.8%
9 2
 
2.6%
Other values (21) 30
38.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 52
66.7%
Space Separator 10
 
12.8%
Decimal Number 7
 
9.0%
Uppercase Letter 5
 
6.4%
Other Punctuation 4
 
5.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 6
11.5%
r 6
11.5%
e 6
11.5%
s 5
9.6%
t 4
 
7.7%
i 3
 
5.8%
h 3
 
5.8%
l 3
 
5.8%
m 2
 
3.8%
f 2
 
3.8%
Other values (7) 12
23.1%
Decimal Number
ValueCountFrequency (%)
9 2
28.6%
3 2
28.6%
1 1
14.3%
2 1
14.3%
5 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
W 1
20.0%
H 1
20.0%
F 1
20.0%
R 1
20.0%
J 1
20.0%
Other Punctuation
ValueCountFrequency (%)
, 2
50.0%
' 1
25.0%
. 1
25.0%
Space Separator
ValueCountFrequency (%)
10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 57
73.1%
Common 21
 
26.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 6
 
10.5%
r 6
 
10.5%
e 6
 
10.5%
s 5
 
8.8%
t 4
 
7.0%
i 3
 
5.3%
h 3
 
5.3%
l 3
 
5.3%
m 2
 
3.5%
f 2
 
3.5%
Other values (12) 17
29.8%
Common
ValueCountFrequency (%)
10
47.6%
9 2
 
9.5%
, 2
 
9.5%
3 2
 
9.5%
1 1
 
4.8%
' 1
 
4.8%
2 1
 
4.8%
5 1
 
4.8%
. 1
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 78
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10
 
12.8%
o 6
 
7.7%
r 6
 
7.7%
e 6
 
7.7%
s 5
 
6.4%
t 4
 
5.1%
i 3
 
3.8%
h 3
 
3.8%
l 3
 
3.8%
9 2
 
2.6%
Other values (21) 30
38.5%

georeferencedBy
Text

Missing 

Distinct: 9
Distinct (%): 100.0%
Missing: 2361464
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 29
Median length: 4
Mean length: 8.111111111
Min length: 4

Characters and Unicode

Total characters: 73
Distinct characters: 30
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): 100.0%

Sample

1st row: Drosera L.
2nd row: 1961
3rd row: 2003
4th row: 1955
5th row: 1911
ValueCountFrequency (%)
drosera 1
 
7.7%
l 1
 
7.7%
1961 1
 
7.7%
2003 1
 
7.7%
1955 1
 
7.7%
1911 1
 
7.7%
1889-03-29 1
 
7.7%
1907 1
 
7.7%
miconia 1
 
7.7%
coronata 1
 
7.7%
Other values (3) 3
23.1%

Most occurring characters

ValueCountFrequency (%)
1 9
 
12.3%
9 6
 
8.2%
o 5
 
6.8%
a 4
 
5.5%
4
 
5.5%
0 4
 
5.5%
r 3
 
4.1%
n 3
 
4.1%
. 3
 
4.1%
8 3
 
4.1%
Other values (20) 29
39.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 32
43.8%
Lowercase Letter 24
32.9%
Uppercase Letter 6
 
8.2%
Space Separator 4
 
5.5%
Other Punctuation 3
 
4.1%
Dash Punctuation 2
 
2.7%
Open Punctuation 1
 
1.4%
Close Punctuation 1
 
1.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 5
20.8%
a 4
16.7%
r 3
12.5%
n 3
12.5%
c 2
 
8.3%
i 2
 
8.3%
e 1
 
4.2%
s 1
 
4.2%
t 1
 
4.2%
p 1
 
4.2%
Decimal Number
ValueCountFrequency (%)
1 9
28.1%
9 6
18.8%
0 4
12.5%
8 3
 
9.4%
7 2
 
6.2%
5 2
 
6.2%
3 2
 
6.2%
2 2
 
6.2%
6 2
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
D 2
33.3%
M 1
16.7%
L 1
16.7%
B 1
16.7%
C 1
16.7%
Space Separator
ValueCountFrequency (%)
4
100.0%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 43
58.9%
Latin 30
41.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 5
16.7%
a 4
13.3%
r 3
10.0%
n 3
10.0%
D 2
 
6.7%
c 2
 
6.7%
i 2
 
6.7%
M 1
 
3.3%
L 1
 
3.3%
e 1
 
3.3%
Other values (6) 6
20.0%
Common
ValueCountFrequency (%)
1 9
20.9%
9 6
14.0%
4
9.3%
0 4
9.3%
. 3
 
7.0%
8 3
 
7.0%
7 2
 
4.7%
- 2
 
4.7%
5 2
 
4.7%
3 2
 
4.7%
Other values (4) 6
14.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 73
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 9
 
12.3%
9 6
 
8.2%
o 5
 
6.8%
a 4
 
5.5%
4
 
5.5%
0 4
 
5.5%
r 3
 
4.1%
n 3
 
4.1%
. 3
 
4.1%
8 3
 
4.1%
Other values (20) 29
39.7%

georeferencedDate
Text

Missing 

Distinct: 3
Distinct (%): 100.0%
Missing: 2361470
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: 1
2nd row: 6
3rd row: 8
ValueCountFrequency (%)
1 1
33.3%
6 1
33.3%
8 1
33.3%

Most occurring characters

ValueCountFrequency (%)
1 1
33.3%
6 1
33.3%
8 1
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1
33.3%
6 1
33.3%
8 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 3
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1
33.3%
6 1
33.3%
8 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1
33.3%
6 1
33.3%
8 1
33.3%

georeferenceProtocol
Text

Missing 

Distinct: 2394
Distinct (%): 0.8%
Missing: 2055868
Missing (%): 87.1%
Memory size: 18.0 MiB

Length

Max length: 302
Median length: 300
Mean length: 25.55988613
Min length: 1

Characters and Unicode

Total characters: 7811229
Distinct characters: 79
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 798
Unique (%): 0.3%

Sample

1st row: unknown, from legacy
2nd row: GEOLocate
3rd row: ArcGIS software with data from New Mexico Resource Geographic Information System Program (http://rgis.unm.edu) and other inhouse resources (historical maps aiding with name changes), MaNIS/HerpNET/ORNIS Georeferencing Guidelines
4th row: Google Earth
5th row: Alexandria Digital Library Gazetteer, MaNIS/HerpNET/ORNIS Georeferencing Guidelines
ValueCountFrequency (%)
from 130830
 
12.9%
unknown 129199
 
12.8%
legacy 128679
 
12.7%
google 54944
 
5.4%
earth 40154
 
4.0%
geolocate 36281
 
3.6%
georeferencing 34967
 
3.5%
manis/herpnet/ornis 34312
 
3.4%
guidelines 34310
 
3.4%
gazetteer 20215
 
2.0%
Other values (2835) 367943
36.4%

Most occurring characters

ValueCountFrequency (%)
706229
 
9.0%
e 670307
 
8.6%
o 567440
 
7.3%
n 560593
 
7.2%
a 452507
 
5.8%
r 410585
 
5.3%
l 272912
 
3.5%
g 263573
 
3.4%
G 248223
 
3.2%
c 246755
 
3.2%
Other values (69) 3412105
43.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5286976
67.7%
Uppercase Letter 1165753
 
14.9%
Space Separator 706229
 
9.0%
Other Punctuation 338184
 
4.3%
Decimal Number 246713
 
3.2%
Open Punctuation 24664
 
0.3%
Close Punctuation 24612
 
0.3%
Dash Punctuation 17941
 
0.2%
Math Symbol 95
 
< 0.1%
Connector Punctuation 62
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 670307
12.7%
o 567440
 
10.7%
n 560593
 
10.6%
a 452507
 
8.6%
r 410585
 
7.8%
l 272912
 
5.2%
g 263573
 
5.0%
c 246755
 
4.7%
u 211476
 
4.0%
i 207930
 
3.9%
Other values (17) 1422898
26.9%
Uppercase Letter
ValueCountFrequency (%)
G 248223
21.3%
S 136912
11.7%
N 129186
11.1%
E 114336
9.8%
I 85150
 
7.3%
O 74320
 
6.4%
M 68715
 
5.9%
T 66001
 
5.7%
L 48246
 
4.1%
R 38810
 
3.3%
Other values (15) 155854
13.4%
Other Punctuation
ValueCountFrequency (%)
, 201579
59.6%
/ 75784
 
22.4%
: 26838
 
7.9%
. 25004
 
7.4%
; 3908
 
1.2%
! 2240
 
0.7%
# 1769
 
0.5%
' 686
 
0.2%
& 350
 
0.1%
? 21
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 112148
45.5%
2 38851
 
15.7%
1 35160
 
14.3%
4 20531
 
8.3%
5 10306
 
4.2%
9 7460
 
3.0%
7 6562
 
2.7%
6 6475
 
2.6%
3 5706
 
2.3%
8 3514
 
1.4%
Space Separator
ValueCountFrequency (%)
706229
100.0%
Open Punctuation
ValueCountFrequency (%)
( 24664
100.0%
Close Punctuation
ValueCountFrequency (%)
) 24612
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 17941
100.0%
Math Symbol
ValueCountFrequency (%)
+ 95
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 62
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6452729
82.6%
Common 1358500
 
17.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 670307
 
10.4%
o 567440
 
8.8%
n 560593
 
8.7%
a 452507
 
7.0%
r 410585
 
6.4%
l 272912
 
4.2%
g 263573
 
4.1%
G 248223
 
3.8%
c 246755
 
3.8%
u 211476
 
3.3%
Other values (42) 2548358
39.5%
Common
ValueCountFrequency (%)
706229
52.0%
, 201579
 
14.8%
0 112148
 
8.3%
/ 75784
 
5.6%
2 38851
 
2.9%
1 35160
 
2.6%
: 26838
 
2.0%
. 25004
 
1.8%
( 24664
 
1.8%
) 24612
 
1.8%
Other values (17) 87631
 
6.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7810197
> 99.9%
None 1032
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
706229
 
9.0%
e 670307
 
8.6%
o 567440
 
7.3%
n 560593
 
7.2%
a 452507
 
5.8%
r 410585
 
5.3%
l 272912
 
3.5%
g 263573
 
3.4%
G 248223
 
3.2%
c 246755
 
3.2%
Other values (68) 3411073
43.7%
None
ValueCountFrequency (%)
í 1032
100.0%

georeferenceSources
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 2361471
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 23
Median length: 12.5
Mean length: 12.5
Min length: 2

Characters and Unicode

Total characters: 25
Distinct characters: 7
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 88
2nd row: 1876 00 00 - 0000 00 00
ValueCountFrequency (%)
00 4
50.0%
88 1
 
12.5%
1876 1
 
12.5%
1
 
12.5%
0000 1
 
12.5%

Most occurring characters

ValueCountFrequency (%)
0 12
48.0%
6
24.0%
8 3
 
12.0%
1 1
 
4.0%
7 1
 
4.0%
6 1
 
4.0%
- 1
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 18
72.0%
Space Separator 6
 
24.0%
Dash Punctuation 1
 
4.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 12
66.7%
8 3
 
16.7%
1 1
 
5.6%
7 1
 
5.6%
6 1
 
5.6%
Space Separator
ValueCountFrequency (%)
6
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 25
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 12
48.0%
6
24.0%
8 3
 
12.0%
1 1
 
4.0%
7 1
 
4.0%
6 1
 
4.0%
- 1
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 12
48.0%
6
24.0%
8 3
 
12.0%
1 1
 
4.0%
7 1
 
4.0%
6 1
 
4.0%
- 1
 
4.0%

georeferenceRemarks
Text

Missing 

Distinct: 4929
Distinct (%): 9.5%
Missing: 2309427
Missing (%): 97.8%
Memory size: 18.0 MiB

Length

Max length: 182
Median length: 126
Mean length: 21.75160435
Min length: 1

Characters and Unicode

Total characters: 1132084
Distinct characters: 81
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2478
Unique (%): 4.8%

Sample

1st row: Locality extent = 400 m
2nd row: Locality extent = 0.6
3rd row: Locality extent = 1.059 mi.
4th row: Locality extent = 800 m
5th row: Coordinate Uncertainty In Meters: 44967
ValueCountFrequency (%)
locality 34471
16.6%
34367
16.6%
extent 34334
16.6%
mi 10224
 
4.9%
ca 4838
 
2.3%
km 2968
 
1.4%
approximate 2550
 
1.2%
in 2301
 
1.1%
coordinate 2093
 
1.0%
meters 2084
 
1.0%
Other values (5049) 76888
37.1%

Most occurring characters

ValueCountFrequency (%)
155072
 
13.7%
t 128158
 
11.3%
e 96698
 
8.5%
a 61605
 
5.4%
i 59118
 
5.2%
o 54266
 
4.8%
n 53362
 
4.7%
l 42258
 
3.7%
c 39954
 
3.5%
. 39122
 
3.5%
Other values (71) 402471
35.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 709218
62.6%
Space Separator 155072
 
13.7%
Decimal Number 108626
 
9.6%
Uppercase Letter 78739
 
7.0%
Other Punctuation 45167
 
4.0%
Math Symbol 34336
 
3.0%
Dash Punctuation 538
 
< 0.1%
Open Punctuation 193
 
< 0.1%
Close Punctuation 193
 
< 0.1%
Initial Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 128158
18.1%
e 96698
13.6%
a 61605
8.7%
i 59118
8.3%
o 54266
7.7%
n 53362
7.5%
l 42258
 
6.0%
c 39954
 
5.6%
y 38215
 
5.4%
x 37841
 
5.3%
Other values (16) 97743
13.8%
Uppercase Letter
ValueCountFrequency (%)
L 34663
44.0%
C 8720
 
11.1%
A 5444
 
6.9%
M 2925
 
3.7%
I 2617
 
3.3%
G 2500
 
3.2%
P 2245
 
2.9%
U 2183
 
2.8%
D 2151
 
2.7%
S 2020
 
2.6%
Other values (16) 13271
 
16.9%
Decimal Number
ValueCountFrequency (%)
0 20572
18.9%
1 18161
16.7%
5 14508
13.4%
2 13297
12.2%
3 10577
9.7%
6 8291
7.6%
4 6768
 
6.2%
7 6465
 
6.0%
8 5819
 
5.4%
9 4168
 
3.8%
Other Punctuation
ValueCountFrequency (%)
. 39122
86.6%
: 2227
 
4.9%
, 1593
 
3.5%
; 1587
 
3.5%
/ 525
 
1.2%
' 93
 
0.2%
" 9
 
< 0.1%
& 6
 
< 0.1%
# 5
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 34320
> 99.9%
+ 16
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 189
97.9%
[ 4
 
2.1%
Close Punctuation
ValueCountFrequency (%)
) 189
97.9%
] 4
 
2.1%
Space Separator
ValueCountFrequency (%)
155072
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 538
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 787957
69.6%
Common 344127
30.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 128158
16.3%
e 96698
12.3%
a 61605
 
7.8%
i 59118
 
7.5%
o 54266
 
6.9%
n 53362
 
6.8%
l 42258
 
5.4%
c 39954
 
5.1%
y 38215
 
4.8%
x 37841
 
4.8%
Other values (42) 176482
22.4%
Common
ValueCountFrequency (%)
155072
45.1%
. 39122
 
11.4%
= 34320
 
10.0%
0 20572
 
6.0%
1 18161
 
5.3%
5 14508
 
4.2%
2 13297
 
3.9%
3 10577
 
3.1%
6 8291
 
2.4%
4 6768
 
2.0%
Other values (19) 23439
 
6.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1132082
> 99.9%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
155072
 
13.7%
t 128158
 
11.3%
e 96698
 
8.5%
a 61605
 
5.4%
i 59118
 
5.2%
o 54266
 
4.8%
n 53362
 
4.7%
l 42258
 
3.7%
c 39954
 
3.5%
. 39122
 
3.5%
Other values (69) 402469
35.6%
Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%

geologicalContextID
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 2361472
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 3
ValueCountFrequency (%)
3 1
100.0%

Most occurring characters

ValueCountFrequency (%)
3 1
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 1
100.0%

earliestEonOrLowestEonothem
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 2361472
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 2
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 29
ValueCountFrequency (%)
29 1
100.0%

Most occurring characters

ValueCountFrequency (%)
2 1
50.0%
9 1
50.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 1
50.0%
9 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 1
50.0%
9 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 1
50.0%
9 1
50.0%
Distinct: 3
Distinct (%): 100.0%
Missing: 2361470
Missing (%): > 99.9%
Memory size: 18.0 MiB

Length

Max length: 67
Median length: 51
Mean length: 43
Min length: 11

Characters and Unicode

Total characters: 129
Distinct characters: 25
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: Plantae, Dicotyledonae, Caryophyllales, Droseraceae
2nd row: 29 Mar 1889
3rd row: Plantae, Dicotyledonae, Myrtales, Melastomataceae, Melastomatoideae
ValueCountFrequency (%)
plantae 2
16.7%
dicotyledonae 2
16.7%
caryophyllales 1
8.3%
droseraceae 1
8.3%
29 1
8.3%
mar 1
8.3%
1889 1
8.3%
myrtales 1
8.3%
melastomataceae 1
8.3%
melastomatoideae 1
8.3%

Most occurring characters

ValueCountFrequency (%)
a 19
14.7%
e 17
13.2%
l 10
 
7.8%
t 9
 
7.0%
9
 
7.0%
o 9
 
7.0%
, 7
 
5.4%
y 5
 
3.9%
r 5
 
3.9%
s 5
 
3.9%
Other values (15) 34
26.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 97
75.2%
Uppercase Letter 10
 
7.8%
Space Separator 9
 
7.0%
Other Punctuation 7
 
5.4%
Decimal Number 6
 
4.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 19
19.6%
e 17
17.5%
l 10
10.3%
t 9
9.3%
o 9
9.3%
y 5
 
5.2%
r 5
 
5.2%
s 5
 
5.2%
n 4
 
4.1%
c 4
 
4.1%
Other values (5) 10
10.3%
Uppercase Letter
ValueCountFrequency (%)
M 4
40.0%
D 3
30.0%
P 2
20.0%
C 1
 
10.0%
Decimal Number
ValueCountFrequency (%)
8 2
33.3%
9 2
33.3%
2 1
16.7%
1 1
16.7%
Space Separator
ValueCountFrequency (%)
9
100.0%
Other Punctuation
ValueCountFrequency (%)
, 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 107
82.9%
Common 22
 
17.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 19
17.8%
e 17
15.9%
l 10
9.3%
t 9
8.4%
o 9
8.4%
y 5
 
4.7%
r 5
 
4.7%
s 5
 
4.7%
n 4
 
3.7%
M 4
 
3.7%
Other values (9) 20
18.7%
Common
ValueCountFrequency (%)
9
40.9%
, 7
31.8%
8 2
 
9.1%
9 2
 
9.1%
2 1
 
4.5%
1 1
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 129
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 19
14.7%
e 17
13.2%
l 10
 
7.8%
t 9
 
7.0%
9
 
7.0%
o 9
 
7.0%
, 7
 
5.4%
y 5
 
3.9%
r 5
 
3.9%
s 5
 
3.9%
Other values (15) 34
26.4%

earliestEraOrLowestErathem
Text

Constant  Missing 

Distinct1
Distinct (%)50.0%
Missing2361471
Missing (%)> 99.9%
Memory size18.0 MiB
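The Distinct and Missing figures above can be related with a short calculation. This sketch assumes that Distinct (%) is computed relative to the non-missing values only, an assumption that fits the numbers here (1 distinct value among 2 non-missing values gives 50.0%) but is not stated in the report. The `summarize` helper is a hypothetical name, and `None` stands in for a missing cell.

```python
def summarize(values):
    """Return distinct/missing counts and percentages for one column."""
    present = [v for v in values if v is not None]
    n_total = len(values)
    n_missing = n_total - len(present)
    n_distinct = len(set(present))
    return {
        "distinct": n_distinct,
        # Assumption: the percentage is taken over non-missing values.
        "distinct_pct": 100.0 * n_distinct / len(present) if present else 0.0,
        "missing": n_missing,
        "missing_pct": 100.0 * n_missing / n_total if n_total else 0.0,
    }


# 2361473 observations, of which only two carry a value ("Plantae" twice).
column = [None] * 2361471 + ["Plantae", "Plantae"]
stats = summarize(column)
print(stats)
```

Under that assumption the missing share comes out just below 100%, which the report displays as "> 99.9%".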

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters14
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPlantae
2nd rowPlantae
ValueCountFrequency (%)
plantae 2
100.0%

Most occurring characters

ValueCountFrequency (%)
a 4
28.6%
P 2
14.3%
l 2
14.3%
n 2
14.3%
t 2
14.3%
e 2
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12
85.7%
Uppercase Letter 2
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
33.3%
l 2
16.7%
n 2
16.7%
t 2
16.7%
e 2
16.7%
Uppercase Letter
ValueCountFrequency (%)
P 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
28.6%
P 2
14.3%
l 2
14.3%
n 2
14.3%
t 2
14.3%
e 2
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4
28.6%
P 2
14.3%
l 2
14.3%
n 2
14.3%
t 2
14.3%
e 2
14.3%

latestEraOrHighestErathem
Text

Constant  Missing 

Distinct1
Distinct (%)50.0%
Missing2361471
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length12
Median length12
Mean length12
Min length12

Characters and Unicode

Total characters24
Distinct characters10
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowTracheophyta
2nd rowTracheophyta
ValueCountFrequency (%)
tracheophyta 2
100.0%

Most occurring characters

ValueCountFrequency (%)
a 4
16.7%
h 4
16.7%
T 2
8.3%
r 2
8.3%
c 2
8.3%
e 2
8.3%
o 2
8.3%
p 2
8.3%
y 2
8.3%
t 2
8.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 22
91.7%
Uppercase Letter 2
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
18.2%
h 4
18.2%
r 2
9.1%
c 2
9.1%
e 2
9.1%
o 2
9.1%
p 2
9.1%
y 2
9.1%
t 2
9.1%
Uppercase Letter
ValueCountFrequency (%)
T 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 24
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
16.7%
h 4
16.7%
T 2
8.3%
r 2
8.3%
c 2
8.3%
e 2
8.3%
o 2
8.3%
p 2
8.3%
y 2
8.3%
t 2
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4
16.7%
h 4
16.7%
T 2
8.3%
r 2
8.3%
c 2
8.3%
e 2
8.3%
o 2
8.3%
p 2
8.3%
y 2
8.3%
t 2
8.3%

earliestPeriodOrLowestSystem
Text

Constant  Missing 

Distinct1
Distinct (%)50.0%
Missing2361471
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length13
Median length13
Mean length13
Min length13

Characters and Unicode

Total characters26
Distinct characters10
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMagnoliopsida
2nd rowMagnoliopsida
ValueCountFrequency (%)
magnoliopsida 2
100.0%

Most occurring characters

ValueCountFrequency (%)
a 4
15.4%
o 4
15.4%
i 4
15.4%
M 2
7.7%
g 2
7.7%
n 2
7.7%
l 2
7.7%
p 2
7.7%
s 2
7.7%
d 2
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 24
92.3%
Uppercase Letter 2
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
16.7%
o 4
16.7%
i 4
16.7%
g 2
8.3%
n 2
8.3%
l 2
8.3%
p 2
8.3%
s 2
8.3%
d 2
8.3%
Uppercase Letter
ValueCountFrequency (%)
M 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 26
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
15.4%
o 4
15.4%
i 4
15.4%
M 2
7.7%
g 2
7.7%
n 2
7.7%
l 2
7.7%
p 2
7.7%
s 2
7.7%
d 2
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4
15.4%
o 4
15.4%
i 4
15.4%
M 2
7.7%
g 2
7.7%
n 2
7.7%
l 2
7.7%
p 2
7.7%
s 2
7.7%
d 2
7.7%
Distinct2
Distinct (%)100.0%
Missing2361471
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length14
Median length11
Mean length11
Min length8

Characters and Unicode

Total characters22
Distinct characters12
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st rowCaryophyllales
2nd rowMyrtales
ValueCountFrequency (%)
caryophyllales 1
50.0%
myrtales 1
50.0%

Most occurring characters

ValueCountFrequency (%)
l 4
18.2%
a 3
13.6%
y 3
13.6%
r 2
9.1%
e 2
9.1%
s 2
9.1%
C 1
 
4.5%
o 1
 
4.5%
p 1
 
4.5%
h 1
 
4.5%
Other values (2) 2
9.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 20
90.9%
Uppercase Letter 2
 
9.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 4
20.0%
a 3
15.0%
y 3
15.0%
r 2
10.0%
e 2
10.0%
s 2
10.0%
o 1
 
5.0%
p 1
 
5.0%
h 1
 
5.0%
t 1
 
5.0%
Uppercase Letter
ValueCountFrequency (%)
C 1
50.0%
M 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 22
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 4
18.2%
a 3
13.6%
y 3
13.6%
r 2
9.1%
e 2
9.1%
s 2
9.1%
C 1
 
4.5%
o 1
 
4.5%
p 1
 
4.5%
h 1
 
4.5%
Other values (2) 2
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 22
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 4
18.2%
a 3
13.6%
y 3
13.6%
r 2
9.1%
e 2
9.1%
s 2
9.1%
C 1
 
4.5%
o 1
 
4.5%
p 1
 
4.5%
h 1
 
4.5%
Other values (2) 2
9.1%

earliestEpochOrLowestSeries
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing2361472
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters7
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st row5410907
ValueCountFrequency (%)
5410907 1
100.0%

Most occurring characters

ValueCountFrequency (%)
0 2
28.6%
5 1
14.3%
4 1
14.3%
1 1
14.3%
9 1
14.3%
7 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2
28.6%
5 1
14.3%
4 1
14.3%
1 1
14.3%
9 1
14.3%
7 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Common 7
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2
28.6%
5 1
14.3%
4 1
14.3%
1 1
14.3%
9 1
14.3%
7 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2
28.6%
5 1
14.3%
4 1
14.3%
1 1
14.3%
9 1
14.3%
7 1
14.3%
Distinct8
Distinct (%)100.0%
Missing2361465
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length69
Median length39.5
Mean length35.875
Min length11

Characters and Unicode

Total characters287
Distinct characters32
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique8 ?
Unique (%)100.0%

Sample

1st rowDroseraceae
2nd rowAsia, Taiwan
3rd rowNorth America, United States, Oklahoma, Pontotoc County
4th rowNorth America, United States, Alaska
5th rowNorth America, United States, Massachusetts
ValueCountFrequency (%)
north 5
14.3%
united 5
14.3%
states 5
14.3%
america 4
11.4%
county 2
 
5.7%
massachusetts 2
 
5.7%
ocean 1
 
2.9%
atlantic 1
 
2.9%
melastomataceae 1
 
2.9%
cochise 1
 
2.9%
Other values (8) 8
22.9%

Most occurring characters

ValueCountFrequency (%)
t 33
 
11.5%
a 31
 
10.8%
27
 
9.4%
e 25
 
8.7%
s 19
 
6.6%
o 15
 
5.2%
i 14
 
4.9%
, 14
 
4.9%
n 13
 
4.5%
r 13
 
4.5%
Other values (22) 83
28.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 211
73.5%
Uppercase Letter 35
 
12.2%
Space Separator 27
 
9.4%
Other Punctuation 14
 
4.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 33
15.6%
a 31
14.7%
e 25
11.8%
s 19
9.0%
o 15
7.1%
i 14
6.6%
n 13
 
6.2%
r 13
 
6.2%
c 12
 
5.7%
h 9
 
4.3%
Other values (9) 27
12.8%
Uppercase Letter
ValueCountFrequency (%)
A 8
22.9%
N 5
14.3%
U 5
14.3%
S 5
14.3%
M 3
 
8.6%
C 3
 
8.6%
O 2
 
5.7%
B 1
 
2.9%
D 1
 
2.9%
P 1
 
2.9%
Space Separator
ValueCountFrequency (%)
27
100.0%
Other Punctuation
ValueCountFrequency (%)
, 14
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 246
85.7%
Common 41
 
14.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 33
13.4%
a 31
12.6%
e 25
10.2%
s 19
 
7.7%
o 15
 
6.1%
i 14
 
5.7%
n 13
 
5.3%
r 13
 
5.3%
c 12
 
4.9%
h 9
 
3.7%
Other values (20) 62
25.2%
Common
ValueCountFrequency (%)
27
65.9%
, 14
34.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 287
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 33
 
11.5%
a 31
 
10.8%
27
 
9.4%
e 25
 
8.7%
s 19
 
6.6%
o 15
 
5.2%
i 14
 
4.9%
, 14
 
4.9%
n 13
 
4.5%
r 13
 
4.5%
Other values (22) 83
28.9%
Distinct2
Distinct (%)40.0%
Missing2361468
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length13
Median length13
Mean length11.2
Min length4

Characters and Unicode

Total characters56
Distinct characters12
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)20.0%

Sample

1st rowASIA
2nd rowNORTH_AMERICA
3rd rowNORTH_AMERICA
4th rowNORTH_AMERICA
5th rowNORTH_AMERICA
ValueCountFrequency (%)
north_america 4
80.0%
asia 1
 
20.0%

Most occurring characters

ValueCountFrequency (%)
A 10
17.9%
R 8
14.3%
I 5
8.9%
N 4
 
7.1%
O 4
 
7.1%
T 4
 
7.1%
H 4
 
7.1%
_ 4
 
7.1%
M 4
 
7.1%
E 4
 
7.1%
Other values (2) 5
8.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 52
92.9%
Connector Punctuation 4
 
7.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 10
19.2%
R 8
15.4%
I 5
9.6%
N 4
 
7.7%
O 4
 
7.7%
T 4
 
7.7%
H 4
 
7.7%
M 4
 
7.7%
E 4
 
7.7%
C 4
 
7.7%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 52
92.9%
Common 4
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 10
19.2%
R 8
15.4%
I 5
9.6%
N 4
 
7.7%
O 4
 
7.7%
T 4
 
7.7%
H 4
 
7.7%
M 4
 
7.7%
E 4
 
7.7%
C 4
 
7.7%
Common
ValueCountFrequency (%)
_ 4
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 10
17.9%
R 8
14.3%
I 5
8.9%
N 4
 
7.1%
O 4
 
7.1%
T 4
 
7.1%
H 4
 
7.1%
_ 4
 
7.1%
M 4
 
7.1%
E 4
 
7.1%
Other values (2) 5
8.9%

latestAgeOrHighestStage
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing2361472
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length20
Median length20
Mean length20
Min length20

Characters and Unicode

Total characters20
Distinct characters14
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowNorth Atlantic Ocean
ValueCountFrequency (%)
north 1
33.3%
atlantic 1
33.3%
ocean 1
33.3%

Most occurring characters

ValueCountFrequency (%)
t 3
15.0%
2
10.0%
a 2
10.0%
n 2
10.0%
c 2
10.0%
N 1
 
5.0%
o 1
 
5.0%
r 1
 
5.0%
h 1
 
5.0%
A 1
 
5.0%
Other values (4) 4
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15
75.0%
Uppercase Letter 3
 
15.0%
Space Separator 2
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 3
20.0%
a 2
13.3%
n 2
13.3%
c 2
13.3%
o 1
 
6.7%
r 1
 
6.7%
h 1
 
6.7%
l 1
 
6.7%
i 1
 
6.7%
e 1
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
N 1
33.3%
A 1
33.3%
O 1
33.3%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18
90.0%
Common 2
 
10.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 3
16.7%
a 2
11.1%
n 2
11.1%
c 2
11.1%
N 1
 
5.6%
o 1
 
5.6%
r 1
 
5.6%
h 1
 
5.6%
A 1
 
5.6%
l 1
 
5.6%
Other values (3) 3
16.7%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 3
15.0%
2
10.0%
a 2
10.0%
n 2
10.0%
c 2
10.0%
N 1
 
5.0%
o 1
 
5.0%
r 1
 
5.0%
h 1
 
5.0%
A 1
 
5.0%
Other values (4) 4
20.0%

lowestBiostratigraphicZone
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing2361472
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters10
Distinct characters10
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowScharf, U.
ValueCountFrequency (%)
scharf 1
50.0%
u 1
50.0%

Most occurring characters

ValueCountFrequency (%)
S 1
10.0%
c 1
10.0%
h 1
10.0%
a 1
10.0%
r 1
10.0%
f 1
10.0%
, 1
10.0%
1
10.0%
U 1
10.0%
. 1
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5
50.0%
Uppercase Letter 2
 
20.0%
Other Punctuation 2
 
20.0%
Space Separator 1
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 1
20.0%
h 1
20.0%
a 1
20.0%
r 1
20.0%
f 1
20.0%
Uppercase Letter
ValueCountFrequency (%)
S 1
50.0%
U 1
50.0%
Other Punctuation
ValueCountFrequency (%)
, 1
50.0%
. 1
50.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
70.0%
Common 3
30.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 1
14.3%
c 1
14.3%
h 1
14.3%
a 1
14.3%
r 1
14.3%
f 1
14.3%
U 1
14.3%
Common
ValueCountFrequency (%)
, 1
33.3%
1
33.3%
. 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 1
10.0%
c 1
10.0%
h 1
10.0%
a 1
10.0%
r 1
10.0%
f 1
10.0%
, 1
10.0%
1
10.0%
U 1
10.0%
. 1
10.0%
Distinct3
Distinct (%)100.0%
Missing2361470
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length58
Median length7
Mean length24
Min length7

Characters and Unicode

Total characters72
Distinct characters29
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3 ?
Unique (%)100.0%

Sample

1st rowDrosera
2nd rowNorth America, Mexico, Baja California Norte, Guadalupe I.
3rd rowMiconia
ValueCountFrequency (%)
drosera 1
10.0%
north 1
10.0%
america 1
10.0%
mexico 1
10.0%
baja 1
10.0%
california 1
10.0%
norte 1
10.0%
guadalupe 1
10.0%
i 1
10.0%
miconia 1
10.0%

Most occurring characters

ValueCountFrequency (%)
a 9
 
12.5%
7
 
9.7%
i 6
 
8.3%
o 6
 
8.3%
r 6
 
8.3%
e 5
 
6.9%
c 3
 
4.2%
, 3
 
4.2%
l 2
 
2.8%
u 2
 
2.8%
Other values (19) 23
31.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 51
70.8%
Uppercase Letter 10
 
13.9%
Space Separator 7
 
9.7%
Other Punctuation 4
 
5.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 9
17.6%
i 6
11.8%
o 6
11.8%
r 6
11.8%
e 5
9.8%
c 3
 
5.9%
l 2
 
3.9%
u 2
 
3.9%
t 2
 
3.9%
n 2
 
3.9%
Other values (8) 8
15.7%
Uppercase Letter
ValueCountFrequency (%)
N 2
20.0%
M 2
20.0%
I 1
10.0%
G 1
10.0%
D 1
10.0%
C 1
10.0%
B 1
10.0%
A 1
10.0%
Other Punctuation
ValueCountFrequency (%)
, 3
75.0%
. 1
 
25.0%
Space Separator
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 61
84.7%
Common 11
 
15.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 9
14.8%
i 6
 
9.8%
o 6
 
9.8%
r 6
 
9.8%
e 5
 
8.2%
c 3
 
4.9%
l 2
 
3.3%
u 2
 
3.3%
t 2
 
3.3%
N 2
 
3.3%
Other values (16) 18
29.5%
Common
ValueCountFrequency (%)
7
63.6%
, 3
27.3%
. 1
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 72
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 9
 
12.5%
7
 
9.7%
i 6
 
8.3%
o 6
 
8.3%
r 6
 
8.3%
e 5
 
6.9%
c 3
 
4.2%
, 3
 
4.2%
l 2
 
2.8%
u 2
 
2.8%
Other values (19) 23
31.9%
Distinct6
Distinct (%)60.0%
Missing2361463
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length25
Median length2
Mean length6.4
Min length2

Characters and Unicode

Total characters64
Distinct characters33
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5 ?
Unique (%)50.0%

Sample

1st rowDrosera
2nd rowTW
3rd rowUS
4th rowCampanula rotundifolia L.
5th rowUS
ValueCountFrequency (%)
us 5
41.7%
drosera 1
 
8.3%
tw 1
 
8.3%
campanula 1
 
8.3%
rotundifolia 1
 
8.3%
l 1
 
8.3%
north_america 1
 
8.3%
miconia 1
 
8.3%

Most occurring characters

ValueCountFrequency (%)
a 6
 
9.4%
U 5
 
7.8%
S 5
 
7.8%
o 4
 
6.2%
i 4
 
6.2%
n 3
 
4.7%
r 3
 
4.7%
M 2
 
3.1%
A 2
 
3.1%
R 2
 
3.1%
Other values (23) 28
43.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 32
50.0%
Uppercase Letter 28
43.8%
Space Separator 2
 
3.1%
Connector Punctuation 1
 
1.6%
Other Punctuation 1
 
1.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6
18.8%
o 4
12.5%
i 4
12.5%
n 3
9.4%
r 3
9.4%
l 2
 
6.2%
u 2
 
6.2%
p 1
 
3.1%
s 1
 
3.1%
e 1
 
3.1%
Other values (5) 5
15.6%
Uppercase Letter
ValueCountFrequency (%)
U 5
17.9%
S 5
17.9%
M 2
 
7.1%
A 2
 
7.1%
R 2
 
7.1%
C 2
 
7.1%
T 2
 
7.1%
O 1
 
3.6%
I 1
 
3.6%
E 1
 
3.6%
Other values (5) 5
17.9%
Space Separator
ValueCountFrequency (%)
2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 60
93.8%
Common 4
 
6.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6
 
10.0%
U 5
 
8.3%
S 5
 
8.3%
o 4
 
6.7%
i 4
 
6.7%
n 3
 
5.0%
r 3
 
5.0%
M 2
 
3.3%
A 2
 
3.3%
R 2
 
3.3%
Other values (20) 24
40.0%
Common
ValueCountFrequency (%)
2
50.0%
_ 1
25.0%
. 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 64
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 6
 
9.4%
U 5
 
7.8%
S 5
 
7.8%
o 4
 
6.2%
i 4
 
6.2%
n 3
 
4.7%
r 3
 
4.7%
M 2
 
3.1%
A 2
 
3.1%
R 2
 
3.1%
Other values (23) 28
43.8%

group
Text

Missing 

Distinct4
Distinct (%)80.0%
Missing2361468
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length13
Median length8
Mean length9.4
Min length6

Characters and Unicode

Total characters47
Distinct characters18
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3 ?
Unique (%)60.0%

Sample

1st rowOklahoma
2nd rowAlaska
3rd rowMassachusetts
4th rowArizona
5th rowMassachusetts
ValueCountFrequency (%)
massachusetts 2
40.0%
oklahoma 1
20.0%
alaska 1
20.0%
arizona 1
20.0%

Most occurring characters

ValueCountFrequency (%)
a 9
19.1%
s 9
19.1%
t 4
 
8.5%
h 3
 
6.4%
M 2
 
4.3%
A 2
 
4.3%
o 2
 
4.3%
l 2
 
4.3%
k 2
 
4.3%
e 2
 
4.3%
Other values (8) 10
21.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 42
89.4%
Uppercase Letter 5
 
10.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 9
21.4%
s 9
21.4%
t 4
9.5%
h 3
 
7.1%
o 2
 
4.8%
l 2
 
4.8%
k 2
 
4.8%
e 2
 
4.8%
u 2
 
4.8%
c 2
 
4.8%
Other values (5) 5
11.9%
Uppercase Letter
ValueCountFrequency (%)
M 2
40.0%
A 2
40.0%
O 1
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 47
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 9
19.1%
s 9
19.1%
t 4
 
8.5%
h 3
 
6.4%
M 2
 
4.3%
A 2
 
4.3%
o 2
 
4.3%
l 2
 
4.3%
k 2
 
4.3%
e 2
 
4.3%
Other values (8) 10
21.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 47
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 9
19.1%
s 9
19.1%
t 4
 
8.5%
h 3
 
6.4%
M 2
 
4.3%
A 2
 
4.3%
o 2
 
4.3%
l 2
 
4.3%
k 2
 
4.3%
e 2
 
4.3%
Other values (8) 10
21.3%

formation
Text

Missing 

Distinct3
Distinct (%)100.0%
Missing2361470
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length17
Median length15
Mean length13
Min length7

Characters and Unicode

Total characters39
Distinct characters18
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3 ?
Unique (%)100.0%

Sample

1st rowPontotoc County
2nd rowCochise
3rd rowBarnstable County
ValueCountFrequency (%)
county 2
40.0%
pontotoc 1
20.0%
cochise 1
20.0%
barnstable 1
20.0%

Most occurring characters

ValueCountFrequency (%)
o 6
15.4%
t 5
12.8%
n 4
10.3%
C 3
 
7.7%
a 2
 
5.1%
c 2
 
5.1%
2
 
5.1%
u 2
 
5.1%
y 2
 
5.1%
s 2
 
5.1%
Other values (8) 9
23.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 32
82.1%
Uppercase Letter 5
 
12.8%
Space Separator 2
 
5.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 6
18.8%
t 5
15.6%
n 4
12.5%
a 2
 
6.2%
c 2
 
6.2%
u 2
 
6.2%
y 2
 
6.2%
s 2
 
6.2%
e 2
 
6.2%
b 1
 
3.1%
Other values (4) 4
12.5%
Uppercase Letter
ValueCountFrequency (%)
C 3
60.0%
P 1
 
20.0%
B 1
 
20.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 37
94.9%
Common 2
 
5.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 6
16.2%
t 5
13.5%
n 4
10.8%
C 3
8.1%
a 2
 
5.4%
c 2
 
5.4%
u 2
 
5.4%
y 2
 
5.4%
s 2
 
5.4%
e 2
 
5.4%
Other values (7) 7
18.9%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 39
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 6
15.4%
t 5
12.8%
n 4
10.3%
C 3
 
7.7%
a 2
 
5.1%
c 2
 
5.1%
2
 
5.1%
u 2
 
5.1%
y 2
 
5.1%
s 2
 
5.1%
Other values (8) 9
23.1%

member
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing2361471
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length12
Median length10
Mean length10
Min length8

Characters and Unicode

Total characters20
Distinct characters15
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st rowGuadalupe I.
2nd rowcoronata
ValueCountFrequency (%)
guadalupe 1
33.3%
i 1
33.3%
coronata 1
33.3%

Most occurring characters

ValueCountFrequency (%)
a 4
20.0%
u 2
 
10.0%
o 2
 
10.0%
G 1
 
5.0%
d 1
 
5.0%
l 1
 
5.0%
p 1
 
5.0%
e 1
 
5.0%
1
 
5.0%
I 1
 
5.0%
Other values (5) 5
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16
80.0%
Uppercase Letter 2
 
10.0%
Space Separator 1
 
5.0%
Other Punctuation 1
 
5.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
25.0%
u 2
12.5%
o 2
12.5%
d 1
 
6.2%
l 1
 
6.2%
p 1
 
6.2%
e 1
 
6.2%
c 1
 
6.2%
r 1
 
6.2%
n 1
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
G 1
50.0%
I 1
50.0%
Space Separator
ValueCountFrequency (%)
1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18
90.0%
Common 2
 
10.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
22.2%
u 2
11.1%
o 2
11.1%
G 1
 
5.6%
d 1
 
5.6%
l 1
 
5.6%
p 1
 
5.6%
e 1
 
5.6%
I 1
 
5.6%
c 1
 
5.6%
Other values (3) 3
16.7%
Common
ValueCountFrequency (%)
1
50.0%
. 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4
20.0%
u 2
 
10.0%
o 2
 
10.0%
G 1
 
5.0%
d 1
 
5.0%
l 1
 
5.0%
p 1
 
5.0%
e 1
 
5.0%
1
 
5.0%
I 1
 
5.0%
Other values (5) 5
25.0%

bed
Text

Missing 

Distinct6
Distinct (%)85.7%
Missing2361466
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length49
Median length10
Mean length12.85714286
Min length2

Characters and Unicode

Total characters90
Distinct characters31
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5 ?
Unique (%)71.4%

Sample

1st rowPing-lin
2nd rowAda
3rd rowSeldovia
4th rowWoods Hole
5th rowMX
ValueCountFrequency (%)
woods 2
14.3%
hole 2
14.3%
ping-lin 1
7.1%
ada 1
7.1%
seldovia 1
7.1%
mx 1
7.1%
chiricahua 1
7.1%
mountains 1
7.1%
barfoot 1
7.1%
park 1
7.1%
Other values (2) 2
14.3%

Most occurring characters

ValueCountFrequency (%)
o 12
 
13.3%
a 7
 
7.8%
7
 
7.8%
l 6
 
6.7%
i 6
 
6.7%
n 6
 
6.7%
s 5
 
5.6%
d 4
 
4.4%
r 3
 
3.3%
e 3
 
3.3%
Other values (21) 31
34.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 66
73.3%
Uppercase Letter 13
 
14.4%
Space Separator 7
 
7.8%
Other Punctuation 3
 
3.3%
Dash Punctuation 1
 
1.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 12
18.2%
a 7
10.6%
l 6
9.1%
i 6
9.1%
n 6
9.1%
s 5
7.6%
d 4
 
6.1%
r 3
 
4.5%
e 3
 
4.5%
t 3
 
4.5%
Other values (8) 11
16.7%
Uppercase Letter
ValueCountFrequency (%)
M 2
15.4%
W 2
15.4%
P 2
15.4%
H 2
15.4%
B 1
7.7%
S 1
7.7%
C 1
7.7%
X 1
7.7%
A 1
7.7%
Other Punctuation
ValueCountFrequency (%)
, 2
66.7%
. 1
33.3%
Space Separator
ValueCountFrequency (%)
7
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 79
87.8%
Common 11
 
12.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 12
15.2%
a 7
 
8.9%
l 6
 
7.6%
i 6
 
7.6%
n 6
 
7.6%
s 5
 
6.3%
d 4
 
5.1%
r 3
 
3.8%
e 3
 
3.8%
t 3
 
3.8%
Other values (17) 24
30.4%
Common
ValueCountFrequency (%)
7
63.6%
, 2
 
18.2%
- 1
 
9.1%
. 1
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 90
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 12
 
13.3%
a 7
 
7.8%
7
 
7.8%
l 6
 
6.7%
i 6
 
6.7%
n 6
 
6.7%
s 5
 
5.6%
d 4
 
4.4%
r 3
 
3.3%
e 3
 
3.3%
Other values (21) 31
34.4%

identificationID
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing2361472
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length21
Median length21
Mean length21
Min length21

Characters and Unicode

Total characters21
Distinct characters14
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowBaja California Norte
ValueCountFrequency (%)
baja 1
33.3%
california 1
33.3%
norte 1
33.3%

Most occurring characters

ValueCountFrequency (%)
a 4
19.0%
2
9.5%
i 2
9.5%
o 2
9.5%
r 2
9.5%
B 1
 
4.8%
j 1
 
4.8%
C 1
 
4.8%
l 1
 
4.8%
f 1
 
4.8%
Other values (4) 4
19.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16
76.2%
Uppercase Letter 3
 
14.3%
Space Separator 2
 
9.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
25.0%
i 2
12.5%
o 2
12.5%
r 2
12.5%
j 1
 
6.2%
l 1
 
6.2%
f 1
 
6.2%
n 1
 
6.2%
t 1
 
6.2%
e 1
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
B 1
33.3%
C 1
33.3%
N 1
33.3%
Space Separator
ValueCountFrequency (%)
(space) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19
90.5%
Common 2
 
9.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
21.1%
i 2
10.5%
o 2
10.5%
r 2
10.5%
B 1
 
5.3%
j 1
 
5.3%
C 1
 
5.3%
l 1
 
5.3%
f 1
 
5.3%
n 1
 
5.3%
Other values (3) 3
15.8%
Common
ValueCountFrequency (%)
(space) 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4
19.0%
(space) 2
9.5%
i 2
9.5%
o 2
9.5%
r 2
9.5%
B 1
 
4.8%
j 1
 
4.8%
C 1
 
4.8%
l 1
 
4.8%
f 1
 
4.8%
Other values (4) 4
19.0%
Distinct3
Distinct (%)100.0%
Missing2361470
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length7
Median length7
Mean length6.333333333
Min length5
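The length figures above can be checked by hand from the three sample values listed for this variable. The sketch below is a minimal reproduction using Python's standard `statistics` module; the function name `length_summary` is an assumption made for illustration, not part of the profiling tool.

```python
import statistics


def length_summary(values):
    """Return max, median, mean and min string length for the given values."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.mean(lengths),
        "min": min(lengths),
    }


# The three non-missing values shown in this variable's Sample section.
summary = length_summary(["GENUS", "3155772", "SPECIES"])
print(summary["max"], summary["median"], round(summary["mean"], 9), summary["min"])
```

The lengths are 5, 7 and 7, which gives a maximum and median of 7, a minimum of 5, and a mean of 19/3, matching the 6.333333333 shown above.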

Characters and Unicode

Total characters19
Distinct characters13
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3 ?
Unique (%)100.0%

Sample

1st rowGENUS
2nd row3155772
3rd rowSPECIES
ValueCountFrequency (%)
genus 1
33.3%
3155772 1
33.3%
species 1
33.3%

Most occurring characters

ValueCountFrequency (%)
E 3
15.8%
S 3
15.8%
5 2
10.5%
7 2
10.5%
G 1
 
5.3%
N 1
 
5.3%
U 1
 
5.3%
3 1
 
5.3%
1 1
 
5.3%
2 1
 
5.3%
Other values (3) 3
15.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 12
63.2%
Decimal Number 7
36.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 3
25.0%
S 3
25.0%
G 1
 
8.3%
N 1
 
8.3%
U 1
 
8.3%
P 1
 
8.3%
C 1
 
8.3%
I 1
 
8.3%
Decimal Number
ValueCountFrequency (%)
5 2
28.6%
7 2
28.6%
3 1
14.3%
1 1
14.3%
2 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 12
63.2%
Common 7
36.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 3
25.0%
S 3
25.0%
G 1
 
8.3%
N 1
 
8.3%
U 1
 
8.3%
P 1
 
8.3%
C 1
 
8.3%
I 1
 
8.3%
Common
ValueCountFrequency (%)
5 2
28.6%
7 2
28.6%
3 1
14.3%
1 1
14.3%
2 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 3
15.8%
S 3
15.8%
5 2
10.5%
7 2
10.5%
G 1
 
5.3%
N 1
 
5.3%
U 1
 
5.3%
3 1
 
5.3%
1 1
 
5.3%
2 1
 
5.3%
Other values (3) 3
15.8%
Distinct24
Distinct (%)0.3%
Missing2352474
Missing (%)99.6%
Memory size18.0 MiB

Length

Max length64
Median length3
Mean length4.292810312
Min length2

Characters and Unicode

Total characters38631
Distinct characters29
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4 ?
Unique (%)< 0.1%

Sample

1st rownear
2nd rowcf.
3rd rowcf.
4th rowvel aff.
5th rowvel aff.
ValueCountFrequency (%)
cf 6018
65.9%
uncertain 1593
 
17.4%
aff 939
 
10.3%
near 259
 
2.8%
s.l 121
 
1.3%
vel 93
 
1.0%
group 24
 
0.3%
subgroup 23
 
0.3%
sp 21
 
0.2%
nov 15
 
0.2%
Other values (12) 29
 
0.3%

Most occurring characters

ValueCountFrequency (%)
f 7896
20.4%
c 7622
19.7%
. 7199
18.6%
n 3471
9.0%
a 2802
 
7.3%
e 1963
 
5.1%
r 1904
 
4.9%
u 1624
 
4.2%
t 1602
 
4.1%
i 1598
 
4.1%
Other values (19) 950
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 31242
80.9%
Other Punctuation 7203
 
18.6%
Space Separator 136
 
0.4%
Uppercase Letter 50
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 7896
25.3%
c 7622
24.4%
n 3471
11.1%
a 2802
 
9.0%
e 1963
 
6.3%
r 1904
 
6.1%
u 1624
 
5.2%
t 1602
 
5.1%
i 1598
 
5.1%
l 225
 
0.7%
Other values (10) 535
 
1.7%
Uppercase Letter
ValueCountFrequency (%)
U 44
88.0%
C 2
 
4.0%
B 1
 
2.0%
P 1
 
2.0%
D 1
 
2.0%
A 1
 
2.0%
Other Punctuation
ValueCountFrequency (%)
. 7199
99.9%
, 4
 
0.1%
Space Separator
ValueCountFrequency (%)
(space) 136
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 31292
81.0%
Common 7339
 
19.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 7896
25.2%
c 7622
24.4%
n 3471
11.1%
a 2802
 
9.0%
e 1963
 
6.3%
r 1904
 
6.1%
u 1624
 
5.2%
t 1602
 
5.1%
i 1598
 
5.1%
l 225
 
0.7%
Other values (16) 585
 
1.9%
Common
ValueCountFrequency (%)
. 7199
98.1%
(space) 136
 
1.9%
, 4
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 38631
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 7896
20.4%
c 7622
19.7%
. 7199
18.6%
n 3471
9.0%
a 2802
 
7.3%
e 1963
 
5.1%
r 1904
 
4.9%
u 1624
 
4.2%
t 1602
 
4.1%
i 1598
 
4.1%
Other values (19) 950
 
2.5%

typeStatus
Text

Missing 

Distinct20
Distinct (%)< 0.1%
Missing2274525
Missing (%)96.3%
Memory size18.0 MiB

Length

Max length34
Median length8
Mean length7.268436307
Min length4

Characters and Unicode

Total characters631976
Distinct characters33
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rowTYPE
2nd rowHOLOTYPE
3rd rowTYPE
4th rowHOLOTYPE
5th rowHOLOTYPE
ValueCountFrequency (%)
holotype 26480
30.5%
paratype 19158
22.0%
isotype 15389
17.7%
type 12139
14.0%
syntype 7658
 
8.8%
lectotype 1798
 
2.1%
isosyntype 1550
 
1.8%
allotype 997
 
1.1%
isolectotype 518
 
0.6%
cotype 484
 
0.6%
Other values (13) 780
 
0.9%

Most occurring characters

ValueCountFrequency (%)
P 106503
16.9%
Y 96139
15.2%
E 89952
14.2%
T 89665
14.2%
O 75084
11.9%
A 40178
 
6.4%
L 31157
 
4.9%
S 26767
 
4.2%
H 26546
 
4.2%
R 19532
 
3.1%
Other values (23) 30453
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 631940
> 99.9%
Lowercase Letter 31
 
< 0.1%
Space Separator 3
 
< 0.1%
Other Punctuation 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 106503
16.9%
Y 96139
15.2%
E 89952
14.2%
T 89665
14.2%
O 75084
11.9%
A 40178
 
6.4%
L 31157
 
4.9%
S 26767
 
4.2%
H 26546
 
4.2%
R 19532
 
3.1%
Other values (6) 30417
 
4.8%
Lowercase Letter
ValueCountFrequency (%)
a 9
29.0%
l 4
12.9%
n 3
 
9.7%
e 2
 
6.5%
u 2
 
6.5%
d 2
 
6.5%
i 2
 
6.5%
j 1
 
3.2%
r 1
 
3.2%
o 1
 
3.2%
Other values (4) 4
12.9%
Other Punctuation
ValueCountFrequency (%)
, 1
50.0%
. 1
50.0%
Space Separator
ValueCountFrequency (%)
(space) 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 631971
> 99.9%
Common 5
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 106503
16.9%
Y 96139
15.2%
E 89952
14.2%
T 89665
14.2%
O 75084
11.9%
A 40178
 
6.4%
L 31157
 
4.9%
S 26767
 
4.2%
H 26546
 
4.2%
R 19532
 
3.1%
Other values (20) 30448
 
4.8%
Common
ValueCountFrequency (%)
(space) 3
60.0%
, 1
 
20.0%
. 1
 
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 631976
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P 106503
16.9%
Y 96139
15.2%
E 89952
14.2%
T 89665
14.2%
O 75084
11.9%
A 40178
 
6.4%
L 31157
 
4.9%
S 26767
 
4.2%
H 26546
 
4.2%
R 19532
 
3.1%
Other values (23) 30453
 
4.8%

identifiedBy
Text

Missing 

Distinct15491
Distinct (%)3.8%
Missing1955406
Missing (%)82.8%
Memory size18.0 MiB

Length

Max length215
Median length136
Mean length36.86477601
Min length2

Characters and Unicode

Total characters14969569
Distinct characters107
Distinct categories12 ?
Distinct scripts2 ?
Distinct blocks3 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5875 ?
Unique (%)1.4%

Sample

1st rowBadley, J. E.
2nd rowStrong, M. T., (US), Smithsonian Institution - National Museum of Natural History (UNITED STATES)
3rd rowJohnson, M. W.
4th rowZibrowius, Helmut, (CNRS-UA 41), Centre d'Oceanologie de Marseille (CNRS-UA 41) (FRANCE)
5th rowFoster, W. D.
ValueCountFrequency (%)
of 101962
 
4.6%
museum 88049
 
3.9%
national 87412
 
3.9%
institution 84694
 
3.8%
smithsonian 84068
 
3.8%
natural 83876
 
3.8%
history 83747
 
3.7%
united 76183
 
3.4%
states 75967
 
3.4%
60502
 
2.7%
Other values (11543) 1407833
63.0%

Most occurring characters

ValueCountFrequency (%)
(space) 1828226
 
12.2%
a 896479
 
6.0%
t 890147
 
5.9%
i 878253
 
5.9%
n 819247
 
5.5%
o 804702
 
5.4%
e 660704
 
4.4%
, 639172
 
4.3%
r 634090
 
4.2%
s 583168
 
3.9%
Other values (97) 6335381
42.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8553794
57.1%
Uppercase Letter 3024063
 
20.2%
Space Separator 1828226
 
12.2%
Other Punctuation 1178772
 
7.9%
Open Punctuation 155751
 
1.0%
Close Punctuation 155751
 
1.0%
Dash Punctuation 71705
 
0.5%
Decimal Number 1459
 
< 0.1%
Math Symbol 25
 
< 0.1%
Initial Punctuation 11
 
< 0.1%
Other values (2) 12
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 896479
10.5%
t 890147
10.4%
i 878253
10.3%
n 819247
9.6%
o 804702
9.4%
e 660704
7.7%
r 634090
7.4%
s 583168
 
6.8%
u 486436
 
5.7%
l 429180
 
5.0%
Other values (41) 1471388
17.2%
Uppercase Letter
ValueCountFrequency (%)
S 355379
11.8%
T 298710
 
9.9%
N 288522
 
9.5%
E 221718
 
7.3%
M 209359
 
6.9%
I 204015
 
6.7%
A 174366
 
5.8%
H 172820
 
5.7%
D 153344
 
5.1%
U 121907
 
4.0%
Other values (20) 823923
27.2%
Other Punctuation
ValueCountFrequency (%)
, 639172
54.2%
. 508849
43.2%
; 23149
 
2.0%
/ 4267
 
0.4%
& 1664
 
0.1%
' 1327
 
0.1%
" 332
 
< 0.1%
¡ 8
 
< 0.1%
? 4
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 673
46.1%
4 672
46.1%
2 45
 
3.1%
9 24
 
1.6%
0 23
 
1.6%
6 22
 
1.5%
Open Punctuation
ValueCountFrequency (%)
( 155163
99.6%
[ 588
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 155163
99.6%
] 588
 
0.4%
Dash Punctuation
ValueCountFrequency (%)
- 71702
> 99.9%
3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 1828226
100.0%
Math Symbol
ValueCountFrequency (%)
+ 25
100.0%
Initial Punctuation
ValueCountFrequency (%)
11
100.0%
Final Punctuation
ValueCountFrequency (%)
11
100.0%
Currency Symbol
ValueCountFrequency (%)
¢ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11577857
77.3%
Common 3391712
 
22.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 896479
 
7.7%
t 890147
 
7.7%
i 878253
 
7.6%
n 819247
 
7.1%
o 804702
 
7.0%
e 660704
 
5.7%
r 634090
 
5.5%
s 583168
 
5.0%
u 486436
 
4.2%
l 429180
 
3.7%
Other values (71) 4495451
38.8%
Common
ValueCountFrequency (%)
(space) 1828226
53.9%
, 639172
 
18.8%
. 508849
 
15.0%
( 155163
 
4.6%
) 155163
 
4.6%
- 71702
 
2.1%
; 23149
 
0.7%
/ 4267
 
0.1%
& 1664
 
< 0.1%
' 1327
 
< 0.1%
Other values (16) 3030
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14959867
99.9%
None 9677
 
0.1%
Punctuation 25
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 1828226
 
12.2%
a 896479
 
6.0%
t 890147
 
6.0%
i 878253
 
5.9%
n 819247
 
5.5%
o 804702
 
5.4%
e 660704
 
4.4%
, 639172
 
4.3%
r 634090
 
4.2%
s 583168
 
3.9%
Other values (63) 6325679
42.3%
None
ValueCountFrequency (%)
í 5003
51.7%
é 1178
 
12.2%
á 1173
 
12.1%
ñ 499
 
5.2%
ö 450
 
4.7%
ü 301
 
3.1%
ó 273
 
2.8%
ä 216
 
2.2%
ã 195
 
2.0%
ú 82
 
0.8%
Other values (21) 307
 
3.2%
Punctuation
ValueCountFrequency (%)
11
44.0%
11
44.0%
3
 
12.0%

identifiedByID
Text

Missing 

Distinct2
Distinct (%)66.7%
Missing2361470
Missing (%)> 99.9%
Memory size18.0 MiB
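The percentages in this overview appear to use two different denominators: Missing (%) is taken over all 2361473 observations, while Distinct (%) and Unique (%) are taken over the non-missing values only (3 here). The sketch below reproduces the figures for this variable under that assumption. The function name `column_summary` and its parameters are hypothetical and are not the profiling tool's API.

```python
from collections import Counter


def column_summary(present_values, total_rows):
    """Summarise a column from its non-missing values and the total row count.

    Assumes Distinct (%) and Unique (%) are relative to non-missing values,
    and Missing (%) is relative to all rows.
    """
    counts = Counter(present_values)
    n_present = len(present_values)
    n_missing = total_rows - n_present
    # "Unique" here means values that occur exactly once.
    n_unique = sum(1 for c in counts.values() if c == 1)
    return {
        "distinct": len(counts),
        "distinct_pct": 100 * len(counts) / n_present if n_present else 0.0,
        "missing": n_missing,
        "missing_pct": 100 * n_missing / total_rows,
        "unique": n_unique,
        "unique_pct": 100 * n_unique / n_present if n_present else 0.0,
    }


# The three non-missing values from this variable's Sample section.
summary = column_summary(["ACCEPTED", "Magnoliopsida", "ACCEPTED"], 2361473)
print(summary)
```

With these inputs the sketch gives 2 distinct values (66.7%), 2361470 missing, and 1 unique value (33.3%). The missing share exceeds 99.9%, which the report displays as "> 99.9%".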

Length

Max length13
Median length8
Mean length9.666666667
Min length8

Characters and Unicode

Total characters29
Distinct characters16
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)33.3%

Sample

1st rowACCEPTED
2nd rowMagnoliopsida
3rd rowACCEPTED
ValueCountFrequency (%)
accepted 2
66.7%
magnoliopsida 1
33.3%

Most occurring characters

ValueCountFrequency (%)
C 4
13.8%
E 4
13.8%
A 2
 
6.9%
P 2
 
6.9%
T 2
 
6.9%
D 2
 
6.9%
a 2
 
6.9%
o 2
 
6.9%
i 2
 
6.9%
M 1
 
3.4%
Other values (6) 6
20.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 17
58.6%
Lowercase Letter 12
41.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
16.7%
o 2
16.7%
i 2
16.7%
g 1
8.3%
n 1
8.3%
l 1
8.3%
p 1
8.3%
s 1
8.3%
d 1
8.3%
Uppercase Letter
ValueCountFrequency (%)
C 4
23.5%
E 4
23.5%
A 2
11.8%
P 2
11.8%
T 2
11.8%
D 2
11.8%
M 1
 
5.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 29
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 4
13.8%
E 4
13.8%
A 2
 
6.9%
P 2
 
6.9%
T 2
 
6.9%
D 2
 
6.9%
a 2
 
6.9%
o 2
 
6.9%
i 2
 
6.9%
M 1
 
3.4%
Other values (6) 6
20.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 29
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 4
13.8%
E 4
13.8%
A 2
 
6.9%
P 2
 
6.9%
T 2
 
6.9%
D 2
 
6.9%
a 2
 
6.9%
o 2
 
6.9%
i 2
 
6.9%
M 1
 
3.4%
Other values (6) 6
20.7%

dateIdentified
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing2361472
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length9
Median length9
Mean length9
Min length9

Characters and Unicode

Total characters9
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowAsterales
ValueCountFrequency (%)
asterales 1
100.0%

Most occurring characters

ValueCountFrequency (%)
s 2
22.2%
e 2
22.2%
A 1
11.1%
t 1
11.1%
r 1
11.1%
a 1
11.1%
l 1
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8
88.9%
Uppercase Letter 1
 
11.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 2
25.0%
e 2
25.0%
t 1
12.5%
r 1
12.5%
a 1
12.5%
l 1
12.5%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 2
22.2%
e 2
22.2%
A 1
11.1%
t 1
11.1%
r 1
11.1%
a 1
11.1%
l 1
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 2
22.2%
e 2
22.2%
A 1
11.1%
t 1
11.1%
r 1
11.1%
a 1
11.1%
l 1
11.1%

identificationReferences
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing2361472
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length37
Median length37
Mean length37
Min length37

Characters and Unicode

Total characters37
Distinct characters22
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowGuatteria punctata (Aubl.) R.A.Howard
ValueCountFrequency (%)
guatteria 1
25.0%
punctata 1
25.0%
aubl 1
25.0%
r.a.howard 1
25.0%

Most occurring characters

ValueCountFrequency (%)
a 5
13.5%
t 4
 
10.8%
(space) 3
 
8.1%
u 3
 
8.1%
. 3
 
8.1%
r 2
 
5.4%
A 2
 
5.4%
G 1
 
2.7%
l 1
 
2.7%
w 1
 
2.7%
Other values (12) 12
32.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 24
64.9%
Uppercase Letter 5
 
13.5%
Space Separator 3
 
8.1%
Other Punctuation 3
 
8.1%
Close Punctuation 1
 
2.7%
Open Punctuation 1
 
2.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5
20.8%
t 4
16.7%
u 3
12.5%
r 2
 
8.3%
l 1
 
4.2%
w 1
 
4.2%
o 1
 
4.2%
b 1
 
4.2%
c 1
 
4.2%
n 1
 
4.2%
Other values (4) 4
16.7%
Uppercase Letter
ValueCountFrequency (%)
A 2
40.0%
G 1
20.0%
H 1
20.0%
R 1
20.0%
Space Separator
ValueCountFrequency (%)
(space) 3
100.0%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 29
78.4%
Common 8
 
21.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5
17.2%
t 4
13.8%
u 3
 
10.3%
r 2
 
6.9%
A 2
 
6.9%
G 1
 
3.4%
l 1
 
3.4%
w 1
 
3.4%
o 1
 
3.4%
H 1
 
3.4%
Other values (8) 8
27.6%
Common
ValueCountFrequency (%)
(space) 3
37.5%
. 3
37.5%
) 1
 
12.5%
( 1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 37
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 5
13.5%
t 4
 
10.8%
(space) 3
 
8.1%
u 3
 
8.1%
. 3
 
8.1%
r 2
 
5.4%
A 2
 
5.4%
G 1
 
2.7%
l 1
 
2.7%
w 1
 
2.7%
Other values (12) 12
32.4%
Distinct6
Distinct (%)85.7%
Missing2361466
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length36
Median length7
Mean length16.14285714
Min length7

Characters and Unicode

Total characters113
Distinct characters23
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5 ?
Unique (%)71.4%

Sample

1st row821cc27a-e3bb-4bc5-ac34-89ada245069d
2nd row24.9115
3rd row34.7745
4th rowCampanulaceae
5th row59.4381
ValueCountFrequency (%)
821cc27a-e3bb-4bc5-ac34-89ada245069d 2
28.6%
24.9115 1
14.3%
34.7745 1
14.3%
campanulaceae 1
14.3%
59.4381 1
14.3%
41.5265 1
14.3%

Most occurring characters

ValueCountFrequency (%)
a 12
 
10.6%
4 11
 
9.7%
5 9
 
8.0%
c 9
 
8.0%
- 8
 
7.1%
2 8
 
7.1%
1 6
 
5.3%
9 6
 
5.3%
3 6
 
5.3%
b 6
 
5.3%
Other values (13) 32
28.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 60
53.1%
Lowercase Letter 40
35.4%
Dash Punctuation 8
 
7.1%
Other Punctuation 4
 
3.5%
Uppercase Letter 1
 
0.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 12
30.0%
c 9
22.5%
b 6
15.0%
e 4
 
10.0%
d 4
 
10.0%
m 1
 
2.5%
p 1
 
2.5%
n 1
 
2.5%
u 1
 
2.5%
l 1
 
2.5%
Decimal Number
ValueCountFrequency (%)
4 11
18.3%
5 9
15.0%
2 8
13.3%
1 6
10.0%
9 6
10.0%
3 6
10.0%
8 5
8.3%
7 4
 
6.7%
6 3
 
5.0%
0 2
 
3.3%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%
Other Punctuation
ValueCountFrequency (%)
. 4
100.0%
Uppercase Letter
ValueCountFrequency (%)
C 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 72
63.7%
Latin 41
36.3%

Most frequent character per script

Common
ValueCountFrequency (%)
4 11
15.3%
5 9
12.5%
- 8
11.1%
2 8
11.1%
1 6
8.3%
9 6
8.3%
3 6
8.3%
8 5
6.9%
7 4
 
5.6%
. 4
 
5.6%
Other values (2) 5
6.9%
Latin
ValueCountFrequency (%)
a 12
29.3%
c 9
22.0%
b 6
14.6%
e 4
 
9.8%
d 4
 
9.8%
C 1
 
2.4%
m 1
 
2.4%
p 1
 
2.4%
n 1
 
2.4%
u 1
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 113
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 12
 
10.6%
4 11
 
9.7%
5 9
 
8.0%
c 9
 
8.0%
- 8
 
7.1%
2 8
 
7.1%
1 6
 
5.3%
9 6
 
5.3%
3 6
 
5.3%
b 6
 
5.3%
Other values (13) 32
28.3%

identificationRemarks
Text

Missing 

Distinct5
Distinct (%)83.3%
Missing2361467
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length8
Median length7
Mean length5.666666667
Min length2

Characters and Unicode

Total characters34
Distinct characters13
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4 ?
Unique (%)66.7%

Sample

1st rowUS
2nd row121.73
3rd row-96.6783
4th row-151.711
5th row-70.6731
ValueCountFrequency (%)
us 2
33.3%
121.73 1
16.7%
96.6783 1
16.7%
151.711 1
16.7%
70.6731 1
16.7%

Most occurring characters

ValueCountFrequency (%)
1 7
20.6%
7 5
14.7%
. 4
11.8%
3 3
8.8%
- 3
8.8%
6 3
8.8%
U 2
 
5.9%
S 2
 
5.9%
2 1
 
2.9%
9 1
 
2.9%
Other values (3) 3
8.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 23
67.6%
Other Punctuation 4
 
11.8%
Uppercase Letter 4
 
11.8%
Dash Punctuation 3
 
8.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 7
30.4%
7 5
21.7%
3 3
13.0%
6 3
13.0%
2 1
 
4.3%
9 1
 
4.3%
8 1
 
4.3%
5 1
 
4.3%
0 1
 
4.3%
Uppercase Letter
ValueCountFrequency (%)
U 2
50.0%
S 2
50.0%
Other Punctuation
ValueCountFrequency (%)
. 4
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 30
88.2%
Latin 4
 
11.8%

Most frequent character per script

Common
ValueCountFrequency (%)
1 7
23.3%
7 5
16.7%
. 4
13.3%
3 3
10.0%
- 3
10.0%
6 3
10.0%
2 1
 
3.3%
9 1
 
3.3%
8 1
 
3.3%
5 1
 
3.3%
Latin
ValueCountFrequency (%)
U 2
50.0%
S 2
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 34
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 7
20.6%
7 5
14.7%
. 4
11.8%
3 3
8.8%
- 3
8.8%
6 3
8.8%
U 2
 
5.9%
S 2
 
5.9%
2 1
 
2.9%
9 1
 
2.9%
Other values (3) 3
8.8%

taxonID
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing2361471
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length24
Median length24
Mean length24
Min length24

Characters and Unicode

Total characters48
Distinct characters14
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st row2024-12-02T13:57:05.005Z
2nd row2024-12-02T13:57:45.829Z
ValueCountFrequency (%)
2024-12-02t13:57:05.005z 1
50.0%
2024-12-02t13:57:45.829z 1
50.0%

Most occurring characters

ValueCountFrequency (%)
2 9
18.8%
0 7
14.6%
5 5
10.4%
- 4
8.3%
1 4
8.3%
: 4
8.3%
4 3
 
6.2%
T 2
 
4.2%
3 2
 
4.2%
7 2
 
4.2%
Other values (4) 6
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 34
70.8%
Other Punctuation 6
 
12.5%
Dash Punctuation 4
 
8.3%
Uppercase Letter 4
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 9
26.5%
0 7
20.6%
5 5
14.7%
1 4
11.8%
4 3
 
8.8%
3 2
 
5.9%
7 2
 
5.9%
8 1
 
2.9%
9 1
 
2.9%
Other Punctuation
ValueCountFrequency (%)
: 4
66.7%
. 2
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 2
50.0%
Z 2
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 44
91.7%
Latin 4
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 9
20.5%
0 7
15.9%
5 5
11.4%
- 4
9.1%
1 4
9.1%
: 4
9.1%
4 3
 
6.8%
3 2
 
4.5%
7 2
 
4.5%
. 2
 
4.5%
Other values (2) 2
 
4.5%
Latin
ValueCountFrequency (%)
T 2
50.0%
Z 2
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 48
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 9
18.8%
0 7
14.6%
5 5
10.4%
- 4
8.3%
1 4
8.3%
: 4
8.3%
4 3
 
6.2%
T 2
 
4.2%
3 2
 
4.2%
7 2
 
4.2%
Other values (4) 6
12.5%

scientificNameID
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing2361472
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters4
Distinct characters4
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st row69.0
ValueCountFrequency (%)
69.0 1
100.0%

Most occurring characters

ValueCountFrequency (%)
6 1
25.0%
9 1
25.0%
. 1
25.0%
0 1
25.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
75.0%
Other Punctuation 1
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 1
33.3%
9 1
33.3%
0 1
33.3%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 1
25.0%
9 1
25.0%
. 1
25.0%
0 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 1
25.0%
9 1
25.0%
. 1
25.0%
0 1
25.0%
Distinct315017
Distinct (%)13.4%
Missing5774
Missing (%)0.2%
Memory size18.0 MiB

Length

Max length9
Median length7
Mean length6.879701948
Min length1

Characters and Unicode

Total characters16206507
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique142213 ?
Unique (%)6.0%

Sample

1st row3869
2nd row3044413
3rd row2431199
4th row714
5th row2322812
ValueCountFrequency (%)
2431491 19390
 
0.8%
225 6083
 
0.3%
7947184 4743
 
0.2%
5967481 3865
 
0.2%
2437967 3815
 
0.2%
2431539 3260
 
0.1%
2440447 2987
 
0.1%
105 2810
 
0.1%
1340278 2739
 
0.1%
2431224 2562
 
0.1%
Other values (315007) 2303445
97.8%

Most occurring characters

ValueCountFrequency (%)
2 2439480
15.1%
3 1778291
11.0%
4 1658961
10.2%
1 1632514
10.1%
5 1571567
9.7%
7 1538551
9.5%
8 1409644
8.7%
6 1402728
8.7%
9 1400354
8.6%
0 1374408
8.5%
Other values (7) 9
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16206498
> 99.9%
Lowercase Letter 8
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2439480
15.1%
3 1778291
11.0%
4 1658961
10.2%
1 1632514
10.1%
5 1571567
9.7%
7 1538551
9.5%
8 1409644
8.7%
6 1402728
8.7%
9 1400354
8.6%
0 1374408
8.5%
Lowercase Letter
ValueCountFrequency (%)
a 3
37.5%
m 1
 
12.5%
p 1
 
12.5%
n 1
 
12.5%
u 1
 
12.5%
l 1
 
12.5%
Uppercase Letter
ValueCountFrequency (%)
C 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 16206498
> 99.9%
Latin 9
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2439480
15.1%
3 1778291
11.0%
4 1658961
10.2%
1 1632514
10.1%
5 1571567
9.7%
7 1538551
9.5%
8 1409644
8.7%
6 1402728
8.7%
9 1400354
8.6%
0 1374408
8.5%
Latin
ValueCountFrequency (%)
a 3
33.3%
C 1
 
11.1%
m 1
 
11.1%
p 1
 
11.1%
n 1
 
11.1%
u 1
 
11.1%
l 1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16206507
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2439480
15.1%
3 1778291
11.0%
4 1658961
10.2%
1 1632514
10.1%
5 1571567
9.7%
7 1538551
9.5%
8 1409644
8.7%
6 1402728
8.7%
9 1400354
8.6%
0 1374408
8.5%
Other values (7) 9
 
< 0.1%

parentNameUsageID
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing2361472
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length9
Median length9
Mean length9
Min length9

Characters and Unicode

Total characters9
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowCampanula
ValueCountFrequency (%)
campanula 1
100.0%

Most occurring characters

ValueCountFrequency (%)
a 3
33.3%
C 1
 
11.1%
m 1
 
11.1%
p 1
 
11.1%
n 1
 
11.1%
u 1
 
11.1%
l 1
 
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8
88.9%
Uppercase Letter 1
 
11.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
37.5%
m 1
 
12.5%
p 1
 
12.5%
n 1
 
12.5%
u 1
 
12.5%
l 1
 
12.5%
Uppercase Letter
ValueCountFrequency (%)
C 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
33.3%
C 1
 
11.1%
m 1
 
11.1%
p 1
 
11.1%
n 1
 
11.1%
u 1
 
11.1%
l 1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3
33.3%
C 1
 
11.1%
m 1
 
11.1%
p 1
 
11.1%
n 1
 
11.1%
u 1
 
11.1%
l 1
 
11.1%

originalNameUsageID
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing2361472
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length68
Median length68
Mean length68
Min length68

Characters and Unicode

Total characters68
Distinct characters21
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowPlantae, Dicotyledonae (basal), Magnoliales, Annonaceae, Annonoideae
ValueCountFrequency (%)
plantae 1
16.7%
dicotyledonae 1
16.7%
basal 1
16.7%
magnoliales 1
16.7%
annonaceae 1
16.7%
annonoideae 1
16.7%

Most occurring characters

ValueCountFrequency (%)
a 10
14.7%
n 9
13.2%
e 8
11.8%
o 6
8.8%
(space) 5
 
7.4%
l 5
 
7.4%
, 4
 
5.9%
i 3
 
4.4%
c 2
 
2.9%
s 2
 
2.9%
Other values (11) 14
20.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 52
76.5%
Space Separator 5
 
7.4%
Uppercase Letter 5
 
7.4%
Other Punctuation 4
 
5.9%
Open Punctuation 1
 
1.5%
Close Punctuation 1
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 10
19.2%
n 9
17.3%
e 8
15.4%
o 6
11.5%
l 5
9.6%
i 3
 
5.8%
c 2
 
3.8%
s 2
 
3.8%
d 2
 
3.8%
t 2
 
3.8%
Other values (3) 3
 
5.8%
Uppercase Letter
ValueCountFrequency (%)
A 2
40.0%
D 1
20.0%
M 1
20.0%
P 1
20.0%
Space Separator
ValueCountFrequency (%)
(space) 5
100.0%
Other Punctuation
ValueCountFrequency (%)
, 4
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 57
83.8%
Common 11
 
16.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 10
17.5%
n 9
15.8%
e 8
14.0%
o 6
10.5%
l 5
8.8%
i 3
 
5.3%
c 2
 
3.5%
s 2
 
3.5%
d 2
 
3.5%
A 2
 
3.5%
Other values (7) 8
14.0%
Common
ValueCountFrequency (%)
(space) 5
45.5%
, 4
36.4%
( 1
 
9.1%
) 1
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 68
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 10
14.7%
n 9
13.2%
e 8
11.8%
o 6
8.8%
(space) 5
 
7.4%
l 5
 
7.4%
, 4
 
5.9%
i 3
 
4.4%
c 2
 
2.9%
s 2
 
2.9%
Other values (11) 14
20.6%

nameAccordingToID
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing2361472
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters7
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
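
The per-category counts reported in these sections can be approximated with Python's standard `unicodedata` module. The following is a minimal sketch, not the profiler's own implementation: the helper names `category_counts` and `char_counts` and the `CATEGORY_NAMES` mapping are assumptions introduced for illustration, and the mapping covers only the categories that appear in this report.

```python
import unicodedata
from collections import Counter

# Illustrative subset: long names for the general-category codes seen in this
# report. Codes not listed here are reported under their two-letter code.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Pc": "Connector Punctuation",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


def char_counts(values):
    """Count individual characters across all values (case-sensitive)."""
    return Counter(ch for value in values for ch in value)


if __name__ == "__main__":
    sample = ["Plantae"]  # the single non-missing value shown for this variable
    print(category_counts(sample).most_common())
    print(char_counts(sample).most_common(3))
```

Applied to the single value `Plantae`, this yields six lowercase letters and one uppercase letter, in line with the category table for this variable. Script and block tallies are not covered here because the standard library does not expose those properties.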

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Plantae
ValueCountFrequency (%)
plantae 1
100.0%

Most occurring characters

ValueCountFrequency (%)
a 2
28.6%
P 1
14.3%
l 1
14.3%
n 1
14.3%
t 1
14.3%
e 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6
85.7%
Uppercase Letter 1
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
33.3%
l 1
16.7%
n 1
16.7%
t 1
16.7%
e 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
P 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
28.6%
P 1
14.3%
l 1
14.3%
n 1
14.3%
t 1
14.3%
e 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2
28.6%
P 1
14.3%
l 1
14.3%
n 1
14.3%
t 1
14.3%
e 1
14.3%

namePublishedInID
Text

Missing 

Distinct 4
Distinct (%) 100.0%
Missing 2361469
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 100
Median length 74
Mean length 43
Min length 12

Characters and Unicode

Total characters 172
Distinct characters 36
Distinct categories 5
Distinct scripts 2
Distinct blocks 1

Unique

Unique 4
Unique (%) 100.0%

Sample

1st row OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;GEODETIC_DATUM_INVALID
2nd row rotundifolia
3rd row Tracheophyta
4th row OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
ValueCountFrequency (%)
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;geodetic_datum_invalid 1
25.0%
rotundifolia 1
25.0%
tracheophyta 1
25.0%
occurrence_status_inferred_from_individual_count 1
25.0%

Most occurring characters

ValueCountFrequency (%)
_ 15
 
8.7%
E 13
 
7.6%
D 12
 
7.0%
I 12
 
7.0%
T 11
 
6.4%
U 11
 
6.4%
C 10
 
5.8%
R 10
 
5.8%
N 9
 
5.2%
A 8
 
4.7%
Other values (26) 61
35.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 130
75.6%
Lowercase Letter 23
 
13.4%
Connector Punctuation 15
 
8.7%
Other Punctuation 2
 
1.2%
Decimal Number 2
 
1.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 13
10.0%
D 12
9.2%
I 12
9.2%
T 11
8.5%
U 11
8.5%
C 10
 
7.7%
R 10
 
7.7%
N 9
 
6.9%
A 8
 
6.2%
O 8
 
6.2%
Other values (7) 26
20.0%
Lowercase Letter
ValueCountFrequency (%)
a 3
13.0%
o 3
13.0%
r 2
 
8.7%
t 2
 
8.7%
h 2
 
8.7%
i 2
 
8.7%
p 1
 
4.3%
e 1
 
4.3%
c 1
 
4.3%
l 1
 
4.3%
Other values (5) 5
21.7%
Decimal Number
ValueCountFrequency (%)
4 1
50.0%
8 1
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 15
100.0%
Other Punctuation
ValueCountFrequency (%)
; 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 153
89.0%
Common 19
 
11.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 13
 
8.5%
D 12
 
7.8%
I 12
 
7.8%
T 11
 
7.2%
U 11
 
7.2%
C 10
 
6.5%
R 10
 
6.5%
N 9
 
5.9%
A 8
 
5.2%
O 8
 
5.2%
Other values (22) 49
32.0%
Common
ValueCountFrequency (%)
_ 15
78.9%
; 2
 
10.5%
4 1
 
5.3%
8 1
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 172
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_ 15
 
8.7%
E 13
 
7.6%
D 12
 
7.0%
I 12
 
7.0%
T 11
 
6.4%
U 11
 
6.4%
C 10
 
5.8%
R 10
 
5.8%
N 9
 
5.2%
A 8
 
4.7%
Other values (26) 61
35.5%

taxonConceptID
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 2361471
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 13
Median length 11.5
Mean length 11.5
Min length 10

Characters and Unicode

Total characters 23
Distinct characters 15
Distinct categories 2
Distinct scripts 1
Distinct blocks 1

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row Magnoliopsida
2nd row StillImage
ValueCountFrequency (%)
magnoliopsida 1
50.0%
stillimage 1
50.0%

Most occurring characters

ValueCountFrequency (%)
a 3
13.0%
l 3
13.0%
i 3
13.0%
g 2
 
8.7%
o 2
 
8.7%
M 1
 
4.3%
n 1
 
4.3%
p 1
 
4.3%
s 1
 
4.3%
d 1
 
4.3%
Other values (5) 5
21.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 20
87.0%
Uppercase Letter 3
 
13.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
15.0%
l 3
15.0%
i 3
15.0%
g 2
10.0%
o 2
10.0%
n 1
 
5.0%
p 1
 
5.0%
s 1
 
5.0%
d 1
 
5.0%
t 1
 
5.0%
Other values (2) 2
10.0%
Uppercase Letter
ValueCountFrequency (%)
M 1
33.3%
S 1
33.3%
I 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 23
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
13.0%
l 3
13.0%
i 3
13.0%
g 2
 
8.7%
o 2
 
8.7%
M 1
 
4.3%
n 1
 
4.3%
p 1
 
4.3%
s 1
 
4.3%
d 1
 
4.3%
Other values (5) 5
21.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3
13.0%
l 3
13.0%
i 3
13.0%
g 2
 
8.7%
o 2
 
8.7%
M 1
 
4.3%
n 1
 
4.3%
p 1
 
4.3%
s 1
 
4.3%
d 1
 
4.3%
Other values (5) 5
21.7%
Distinct 362008
Distinct (%) 15.3%
Missing 10
Missing (%) < 0.1%
Memory size 18.0 MiB

Length

Max length 234
Median length 109
Mean length 31.78332034
Min length 4

Characters and Unicode

Total characters 75055135
Distinct characters 128
Distinct categories 10
Distinct scripts 2
Distinct blocks 2

Unique

Unique 179316
Unique (%) 7.6%

Sample

1st row Hippolytidae
2nd row Lesquerella lescurii (A.Gray) S.Watson
3rd row Desmognathus ochrophaeus Cope, 1859
4th row Scleractinia
5th row Ninoe kinbergi Ehlers, 1887
ValueCountFrequency (%)
238208
 
2.6%
l 180173
 
2.0%
ex 82914
 
0.9%
linnaeus 79974
 
0.9%
1758 62066
 
0.7%
var 50926
 
0.6%
plethodon 42963
 
0.5%
1818 33176
 
0.4%
kunth 29864
 
0.3%
dc 29734
 
0.3%
Other values (185831) 8293415
90.9%

Most occurring characters

ValueCountFrequency (%)
(space) 6761950
 
9.0%
a 6242590
 
8.3%
i 5013451
 
6.7%
e 4703897
 
6.3%
r 3970577
 
5.3%
s 3847327
 
5.1%
o 3620560
 
4.8%
n 3486756
 
4.6%
l 3394828
 
4.5%
u 2955157
 
3.9%
Other values (118) 31058042
41.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 52793618
70.3%
Space Separator 6761950
 
9.0%
Uppercase Letter 6033053
 
8.0%
Decimal Number 4371020
 
5.8%
Other Punctuation 3122342
 
4.2%
Close Punctuation 972201
 
1.3%
Open Punctuation 972201
 
1.3%
Dash Punctuation 25650
 
< 0.1%
Math Symbol 3079
 
< 0.1%
Connector Punctuation 21
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6242590
11.8%
i 5013451
 
9.5%
e 4703897
 
8.9%
r 3970577
 
7.5%
s 3847327
 
7.3%
o 3620560
 
6.9%
n 3486756
 
6.6%
l 3394828
 
6.4%
u 2955157
 
5.6%
t 2758965
 
5.2%
Other values (56) 12799510
24.2%
Uppercase Letter
ValueCountFrequency (%)
L 571632
 
9.5%
S 558239
 
9.3%
C 547216
 
9.1%
P 505361
 
8.4%
A 411245
 
6.8%
B 404229
 
6.7%
M 403339
 
6.7%
H 337256
 
5.6%
G 316928
 
5.3%
D 272745
 
4.5%
Other values (31) 1704863
28.3%
Decimal Number
ValueCountFrequency (%)
1 1310245
30.0%
8 913594
20.9%
9 482023
 
11.0%
7 349542
 
8.0%
5 252107
 
5.8%
0 229816
 
5.3%
2 222891
 
5.1%
6 220119
 
5.0%
3 202794
 
4.6%
4 187889
 
4.3%
Other Punctuation
ValueCountFrequency (%)
. 1769758
56.7%
, 1109322
35.5%
& 238207
 
7.6%
' 5054
 
0.2%
? 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 6761950
100.0%
Close Punctuation
ValueCountFrequency (%)
) 972201
100.0%
Open Punctuation
ValueCountFrequency (%)
( 972201
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 25650
100.0%
Math Symbol
ValueCountFrequency (%)
× 3079
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 58826671
78.4%
Common 16228464
 
21.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6242590
 
10.6%
i 5013451
 
8.5%
e 4703897
 
8.0%
r 3970577
 
6.7%
s 3847327
 
6.5%
o 3620560
 
6.2%
n 3486756
 
5.9%
l 3394828
 
5.8%
u 2955157
 
5.0%
t 2758965
 
4.7%
Other values (97) 18832563
32.0%
Common
ValueCountFrequency (%)
(space) 6761950
41.7%
. 1769758
 
10.9%
1 1310245
 
8.1%
, 1109322
 
6.8%
) 972201
 
6.0%
( 972201
 
6.0%
8 913594
 
5.6%
9 482023
 
3.0%
7 349542
 
2.2%
5 252107
 
1.6%
Other values (11) 1335521
 
8.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 74931506
99.8%
None 123629
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 6761950
 
9.0%
a 6242590
 
8.3%
i 5013451
 
6.7%
e 4703897
 
6.3%
r 3970577
 
5.3%
s 3847327
 
5.1%
o 3620560
 
4.8%
n 3486756
 
4.7%
l 3394828
 
4.5%
u 2955157
 
3.9%
Other values (62) 30934413
41.3%
None
ValueCountFrequency (%)
ü 38809
31.4%
é 27121
21.9%
ö 16572
13.4%
è 11041
 
8.9%
å 4789
 
3.9%
ø 3665
 
3.0%
ä 3627
 
2.9%
á 3625
 
2.9%
× 3079
 
2.5%
Á 2054
 
1.7%
Other values (46) 9247
 
7.5%

acceptedNameUsage
Text

Missing 

Distinct 2
Distinct (%) 66.7%
Missing 2361470
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 7
Median length 5
Mean length 5.666666667
Min length 5

Characters and Unicode

Total characters 17
Distinct characters 10
Distinct categories 2
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 33.3%

Sample

1st row false
2nd row SPECIES
3rd row false
ValueCountFrequency (%)
false 2
66.7%
species 1
33.3%

Most occurring characters

ValueCountFrequency (%)
f 2
11.8%
a 2
11.8%
l 2
11.8%
s 2
11.8%
e 2
11.8%
S 2
11.8%
E 2
11.8%
P 1
5.9%
C 1
5.9%
I 1
5.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10
58.8%
Uppercase Letter 7
41.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 2
20.0%
a 2
20.0%
l 2
20.0%
s 2
20.0%
e 2
20.0%
Uppercase Letter
ValueCountFrequency (%)
S 2
28.6%
E 2
28.6%
P 1
14.3%
C 1
14.3%
I 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 17
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 2
11.8%
a 2
11.8%
l 2
11.8%
s 2
11.8%
e 2
11.8%
S 2
11.8%
E 2
11.8%
P 1
5.9%
C 1
5.9%
I 1
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 2
11.8%
a 2
11.8%
l 2
11.8%
s 2
11.8%
e 2
11.8%
S 2
11.8%
E 2
11.8%
P 1
5.9%
C 1
5.9%
I 1
5.9%

parentNameUsage
Text

Missing 

Distinct 4
Distinct (%) 100.0%
Missing 2361469
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 10
Median length 9.5
Mean length 8.25
Min length 7

Characters and Unicode

Total characters 33
Distinct characters 19
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 4
Unique (%) 100.0%

Sample

1st row 3190721
2nd row GEOLocate
3rd row Annonaceae
4th row 3869031
ValueCountFrequency (%)
3190721 1
25.0%
geolocate 1
25.0%
annonaceae 1
25.0%
3869031 1
25.0%

Most occurring characters

ValueCountFrequency (%)
3 3
 
9.1%
a 3
 
9.1%
n 3
 
9.1%
e 3
 
9.1%
1 3
 
9.1%
9 2
 
6.1%
0 2
 
6.1%
o 2
 
6.1%
c 2
 
6.1%
8 1
 
3.0%
Other values (9) 9
27.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14
42.4%
Lowercase Letter 14
42.4%
Uppercase Letter 5
 
15.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 3
21.4%
1 3
21.4%
9 2
14.3%
0 2
14.3%
8 1
 
7.1%
2 1
 
7.1%
7 1
 
7.1%
6 1
 
7.1%
Lowercase Letter
ValueCountFrequency (%)
a 3
21.4%
n 3
21.4%
e 3
21.4%
o 2
14.3%
c 2
14.3%
t 1
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
A 1
20.0%
L 1
20.0%
O 1
20.0%
E 1
20.0%
G 1
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19
57.6%
Common 14
42.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
15.8%
n 3
15.8%
e 3
15.8%
o 2
10.5%
c 2
10.5%
A 1
 
5.3%
t 1
 
5.3%
L 1
 
5.3%
O 1
 
5.3%
E 1
 
5.3%
Common
ValueCountFrequency (%)
3 3
21.4%
1 3
21.4%
9 2
14.3%
0 2
14.3%
8 1
 
7.1%
2 1
 
7.1%
7 1
 
7.1%
6 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 3
 
9.1%
a 3
 
9.1%
n 3
 
9.1%
e 3
 
9.1%
1 3
 
9.1%
9 2
 
6.1%
0 2
 
6.1%
o 2
 
6.1%
c 2
 
6.1%
8 1
 
3.0%
Other values (9) 9
27.3%

originalNameUsage
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 2361471
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 14
Distinct characters 8
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row 3190721
2nd row 3869031
ValueCountFrequency (%)
3190721 1
50.0%
3869031 1
50.0%

Most occurring characters

ValueCountFrequency (%)
3 3
21.4%
1 3
21.4%
9 2
14.3%
0 2
14.3%
7 1
 
7.1%
2 1
 
7.1%
8 1
 
7.1%
6 1
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 3
21.4%
1 3
21.4%
9 2
14.3%
0 2
14.3%
7 1
 
7.1%
2 1
 
7.1%
8 1
 
7.1%
6 1
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Common 14
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 3
21.4%
1 3
21.4%
9 2
14.3%
0 2
14.3%
7 1
 
7.1%
2 1
 
7.1%
8 1
 
7.1%
6 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 3
21.4%
1 3
21.4%
9 2
14.3%
0 2
14.3%
7 1
 
7.1%
2 1
 
7.1%
8 1
 
7.1%
6 1
 
7.1%

nameAccordingTo
Text

Constant  Missing 

Distinct 1
Distinct (%) 50.0%
Missing 2361471
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 2
Distinct characters 1
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 6
2nd row 6
ValueCountFrequency (%)
6 2
100.0%

Most occurring characters

ValueCountFrequency (%)
6 2
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 2
100.0%

namePublishedIn
Text

Missing 

Distinct 2
Distinct (%) 66.7%
Missing 2361470
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 8
Median length 7
Mean length 7.333333333
Min length 7

Characters and Unicode

Total characters 22
Distinct characters 10
Distinct categories 2
Distinct scripts 2
Distinct blocks 1

Unique

Unique 1
Unique (%) 33.3%

Sample

1st row 7707728
2nd row ACCEPTED
3rd row 7707728
ValueCountFrequency (%)
7707728 2
66.7%
accepted 1
33.3%

Most occurring characters

ValueCountFrequency (%)
7 8
36.4%
0 2
 
9.1%
2 2
 
9.1%
8 2
 
9.1%
C 2
 
9.1%
E 2
 
9.1%
A 1
 
4.5%
P 1
 
4.5%
T 1
 
4.5%
D 1
 
4.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14
63.6%
Uppercase Letter 8
36.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 2
25.0%
E 2
25.0%
A 1
12.5%
P 1
12.5%
T 1
12.5%
D 1
12.5%
Decimal Number
ValueCountFrequency (%)
7 8
57.1%
0 2
 
14.3%
2 2
 
14.3%
8 2
 
14.3%

Most occurring scripts

ValueCountFrequency (%)
Common 14
63.6%
Latin 8
36.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 2
25.0%
E 2
25.0%
A 1
12.5%
P 1
12.5%
T 1
12.5%
D 1
12.5%
Common
ValueCountFrequency (%)
7 8
57.1%
0 2
 
14.3%
2 2
 
14.3%
8 2
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 22
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 8
36.4%
0 2
 
9.1%
2 2
 
9.1%
8 2
 
9.1%
C 2
 
9.1%
E 2
 
9.1%
A 1
 
4.5%
P 1
 
4.5%
T 1
 
4.5%
D 1
 
4.5%

namePublishedInYear
Text

Missing 

Distinct 2
Distinct (%) 66.7%
Missing 2361470
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 9
Median length 3
Mean length 5
Min length 3

Characters and Unicode

Total characters 15
Distinct characters 9
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 1
Unique (%) 33.3%

Sample

1st row 220
2nd row Guatteria
3rd row 220
ValueCountFrequency (%)
220 2
66.7%
guatteria 1
33.3%

Most occurring characters

ValueCountFrequency (%)
2 4
26.7%
0 2
13.3%
a 2
13.3%
t 2
13.3%
G 1
 
6.7%
u 1
 
6.7%
e 1
 
6.7%
r 1
 
6.7%
i 1
 
6.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8
53.3%
Decimal Number 6
40.0%
Uppercase Letter 1
 
6.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2
25.0%
t 2
25.0%
u 1
12.5%
e 1
12.5%
r 1
12.5%
i 1
12.5%
Decimal Number
ValueCountFrequency (%)
2 4
66.7%
0 2
33.3%
Uppercase Letter
ValueCountFrequency (%)
G 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9
60.0%
Common 6
40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2
22.2%
t 2
22.2%
G 1
11.1%
u 1
11.1%
e 1
11.1%
r 1
11.1%
i 1
11.1%
Common
ValueCountFrequency (%)
2 4
66.7%
0 2
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 4
26.7%
0 2
13.3%
a 2
13.3%
t 2
13.3%
G 1
 
6.7%
u 1
 
6.7%
e 1
 
6.7%
r 1
 
6.7%
i 1
 
6.7%
Distinct 9381
Distinct (%) 0.4%
Missing 5000
Missing (%) 0.2%
Memory size 18.0 MiB

Length

Max length 164
Median length 148
Mean length 65.02881001
Min length 3

Characters and Unicode

Total characters 153238635
Distinct characters 72
Distinct categories 10
Distinct scripts 2
Distinct blocks 2

Unique

Unique 1505
Unique (%) 0.1%

Sample

1st row Animalia, Arthropoda, Crustacea, Malacostraca, Eumalacostraca, Eucarida, Decapoda, Pleocyemata, Hippolytidae
2nd row Plantae, Dicotyledonae, Brassicales, Brassicaceae, Brassicoideae
3rd row Animalia, Chordata, Vertebrata, Amphibia, Caudata, Plethodontidae
4th row Animalia, Cnidaria, Anthozoa, Hexacorallia, Scleractinia
5th row Animalia, Annelida, Polychaeta, Errantia, Eunicida, Lumbrineridae
ValueCountFrequency (%)
animalia 1209335
 
9.1%
plantae 1054356
 
7.9%
dicotyledonae 657170
 
4.9%
chordata 572776
 
4.3%
vertebrata 567549
 
4.3%
arthropoda 251879
 
1.9%
monocotyledonae 231105
 
1.7%
mollusca 220773
 
1.7%
poales 178488
 
1.3%
gastropoda 155944
 
1.2%
Other values (9606) 8232767
61.8%

Most occurring characters

ValueCountFrequency (%)
a 21342566
13.9%
e 15326706
 
10.0%
i 11163562
 
7.3%
(space) 10975669
 
7.2%
, 10942165
 
7.1%
o 9722855
 
6.3%
t 8276865
 
5.4%
l 7536114
 
4.9%
r 7095165
 
4.6%
n 6565956
 
4.3%
Other values (62) 44291012
28.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 117973422
77.0%
Uppercase Letter 13298578
 
8.7%
Space Separator 10975669
 
7.2%
Other Punctuation 10947398
 
7.1%
Open Punctuation 21699
 
< 0.1%
Close Punctuation 21699
 
< 0.1%
Dash Punctuation 127
 
< 0.1%
Connector Punctuation 31
 
< 0.1%
Decimal Number 11
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 21342566
18.1%
e 15326706
13.0%
i 11163562
9.5%
o 9722855
8.2%
t 8276865
 
7.0%
l 7536114
 
6.4%
r 7095165
 
6.0%
n 6565956
 
5.6%
d 5401327
 
4.6%
c 5122421
 
4.3%
Other values (17) 20419885
17.3%
Uppercase Letter
ValueCountFrequency (%)
A 2756481
20.7%
P 2449943
18.4%
C 1607999
12.1%
M 1121374
8.4%
D 843351
 
6.3%
V 632813
 
4.8%
E 535529
 
4.0%
S 528149
 
4.0%
L 345229
 
2.6%
R 342094
 
2.6%
Other values (16) 2135616
16.1%
Decimal Number
ValueCountFrequency (%)
6 2
18.2%
9 2
18.2%
0 2
18.2%
2 2
18.2%
4 1
9.1%
1 1
9.1%
3 1
9.1%
Other Punctuation
ValueCountFrequency (%)
, 10942165
> 99.9%
. 5211
 
< 0.1%
? 16
 
< 0.1%
/ 6
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 21661
99.8%
[ 38
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 21661
99.8%
] 38
 
0.2%
Space Separator
ValueCountFrequency (%)
(space) 10975669
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 127
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 31
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 131272000
85.7%
Common 21966635
 
14.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 21342566
16.3%
e 15326706
11.7%
i 11163562
 
8.5%
o 9722855
 
7.4%
t 8276865
 
6.3%
l 7536114
 
5.7%
r 7095165
 
5.4%
n 6565956
 
5.0%
d 5401327
 
4.1%
c 5122421
 
3.9%
Other values (43) 33718463
25.7%
Common
ValueCountFrequency (%)
(space) 10975669
50.0%
, 10942165
49.8%
( 21661
 
0.1%
) 21661
 
0.1%
. 5211
 
< 0.1%
- 127
 
< 0.1%
[ 38
 
< 0.1%
] 38
 
< 0.1%
_ 31
 
< 0.1%
? 16
 
< 0.1%
Other values (9) 18
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 153238470
> 99.9%
None 165
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 21342566
13.9%
e 15326706
 
10.0%
i 11163562
 
7.3%
(space) 10975669
 
7.2%
, 10942165
 
7.1%
o 9722855
 
6.3%
t 8276865
 
5.4%
l 7536114
 
4.9%
r 7095165
 
4.6%
n 6565956
 
4.3%
Other values (61) 44290847
28.9%
None
ValueCountFrequency (%)
ö 165
100.0%
Distinct 10
Distinct (%) < 0.1%
Missing 10
Missing (%) < 0.1%
Memory size 18.0 MiB

Length

Max length 36
Median length 8
Mean length 7.504671892
Min length 4

Characters and Unicode

Total characters 17722005
Distinct characters 34
Distinct categories 5
Distinct scripts 2
Distinct blocks 1

Unique

Unique 3
Unique (%) < 0.1%

Sample

1st row Animalia
2nd row Plantae
3rd row Animalia
4th row Animalia
5th row Animalia
ValueCountFrequency (%)
animalia 1209386
51.1%
plantae 1054744
44.6%
fungi 56807
 
2.4%
chromista 20874
 
0.9%
bacteria 13612
 
0.6%
incertae 5762
 
0.2%
sedis 5762
 
0.2%
protozoa 275
 
< 0.1%
5399 1
 
< 0.1%
821cc27a-e3bb-4bc5-ac34-89ada245069d 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a 4582399
25.9%
i 2521589
14.2%
n 2326699
13.1%
l 2264130
12.8%
m 1230260
 
6.9%
A 1209386
 
6.8%
t 1095267
 
6.2%
e 1085643
 
6.1%
P 1055019
 
6.0%
u 56807
 
0.3%
Other values (24) 294806
 
1.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15360515
86.7%
Uppercase Letter 2355698
 
13.3%
Space Separator 5762
 
< 0.1%
Decimal Number 26
 
< 0.1%
Dash Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4582399
29.8%
i 2521589
16.4%
n 2326699
15.1%
l 2264130
14.7%
m 1230260
 
8.0%
t 1095267
 
7.1%
e 1085643
 
7.1%
u 56807
 
0.4%
g 56807
 
0.4%
r 40523
 
0.3%
Other values (7) 100391
 
0.7%
Decimal Number
ValueCountFrequency (%)
3 4
15.4%
9 4
15.4%
5 3
11.5%
8 3
11.5%
2 3
11.5%
4 3
11.5%
6 3
11.5%
1 1
 
3.8%
7 1
 
3.8%
0 1
 
3.8%
Uppercase Letter
ValueCountFrequency (%)
A 1209386
51.3%
P 1055019
44.8%
F 56807
 
2.4%
C 20874
 
0.9%
B 13612
 
0.6%
Space Separator
ValueCountFrequency (%)
(space) 5762
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 17716213
> 99.9%
Common 5792
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4582399
25.9%
i 2521589
14.2%
n 2326699
13.1%
l 2264130
12.8%
m 1230260
 
6.9%
A 1209386
 
6.8%
t 1095267
 
6.2%
e 1085643
 
6.1%
P 1055019
 
6.0%
u 56807
 
0.3%
Other values (12) 289014
 
1.6%
Common
ValueCountFrequency (%)
(space) 5762
99.5%
3 4
 
0.1%
9 4
 
0.1%
- 4
 
0.1%
5 3
 
0.1%
8 3
 
0.1%
2 3
 
0.1%
4 3
 
0.1%
6 3
 
0.1%
1 1
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17722005
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4582399
25.9%
i 2521589
14.2%
n 2326699
13.1%
l 2264130
12.8%
m 1230260
 
6.9%
A 1209386
 
6.8%
t 1095267
 
6.2%
e 1085643
 
6.1%
P 1055019
 
6.0%
u 56807
 
0.3%
Other values (24) 294806
 
1.7%

phylum
Text

Distinct 64
Distinct (%) < 0.1%
Missing 7896
Missing (%) 0.3%
Memory size 18.0 MiB

Length

Max length 17
Median length 16
Mean length 10.11762054
Min length 2

Characters and Unicode

Total characters 23812599
Distinct characters 47
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 7
Unique (%) < 0.1%

Sample

1st row Arthropoda
2nd row Tracheophyta
3rd row Chordata
4th row Cnidaria
5th row Annelida
ValueCountFrequency (%)
tracheophyta 965311
41.0%
chordata 572771
24.3%
arthropoda 252406
 
10.7%
mollusca 220179
 
9.4%
annelida 61416
 
2.6%
ascomycota 56083
 
2.4%
bryophyta 37922
 
1.6%
rhodophyta 30954
 
1.3%
cnidaria 29998
 
1.3%
echinodermata 23220
 
1.0%
Other values (54) 103317
 
4.4%

Most occurring characters

ValueCountFrequency (%)
a 4009610
16.8%
h 2984383
12.5%
o 2602587
10.9%
r 2209125
9.3%
t 2042901
8.6%
c 1367427
 
5.7%
p 1328235
 
5.6%
y 1197926
 
5.0%
e 1120639
 
4.7%
d 989852
 
4.2%
Other values (37) 3959914
16.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 21459009
90.1%
Uppercase Letter 2353576
 
9.9%
Decimal Number 14
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4009610
18.7%
h 2984383
13.9%
o 2602587
12.1%
r 2209125
10.3%
t 2042901
9.5%
c 1367427
 
6.4%
p 1328235
 
6.2%
y 1197926
 
5.6%
e 1120639
 
5.2%
d 989852
 
4.6%
Other values (10) 1606324
7.5%
Uppercase Letter
ValueCountFrequency (%)
T 965446
41.0%
C 629275
26.7%
A 371272
 
15.8%
M 229756
 
9.8%
B 40910
 
1.7%
R 31169
 
1.3%
E 23307
 
1.0%
P 20411
 
0.9%
N 19086
 
0.8%
O 17979
 
0.8%
Other values (9) 4965
 
0.2%
Decimal Number
ValueCountFrequency (%)
1 3
21.4%
8 3
21.4%
3 2
14.3%
5 2
14.3%
9 1
 
7.1%
0 1
 
7.1%
7 1
 
7.1%
2 1
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 23812585
> 99.9%
Common 14
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4009610
16.8%
h 2984383
12.5%
o 2602587
10.9%
r 2209125
9.3%
t 2042901
8.6%
c 1367427
 
5.7%
p 1328235
 
5.6%
y 1197926
 
5.0%
e 1120639
 
4.7%
d 989852
 
4.2%
Other values (29) 3959900
16.6%
Common
ValueCountFrequency (%)
1 3
21.4%
8 3
21.4%
3 2
14.3%
5 2
14.3%
9 1
 
7.1%
0 1
 
7.1%
7 1
 
7.1%
2 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23812599
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4009610
16.8%
h 2984383
12.5%
o 2602587
10.9%
r 2209125
9.3%
t 2042901
8.6%
c 1367427
 
5.7%
p 1328235
 
5.6%
y 1197926
 
5.0%
e 1120639
 
4.7%
d 989852
 
4.2%
Other values (37) 3959914
16.6%

class
Text

Missing 

Distinct 186
Distinct (%) < 0.1%
Missing 138563
Missing (%) 5.9%
Memory size 18.0 MiB

Length

Max length 24
Median length 20
Mean length 10.43315114
Min length 4

Characters and Unicode

Total characters 23191956
Distinct characters 59
Distinct categories 5
Distinct scripts 2
Distinct blocks 1

Unique

Unique 20
Unique (%) < 0.1%

Sample

1st row Malacostraca
2nd row Magnoliopsida
3rd row Amphibia
4th row Anthozoa
5th row Polychaeta
ValueCountFrequency (%)
magnoliopsida 657370
29.6%
liliopsida 231154
 
10.4%
gastropoda 155259
 
7.0%
mammalia 152953
 
6.9%
insecta 149742
 
6.7%
aves 149231
 
6.7%
amphibia 100689
 
4.5%
malacostraca 76525
 
3.4%
polypodiopsida 63916
 
2.9%
polychaeta 53619
 
2.4%
Other values (176) 432452
19.5%

Most occurring characters

ValueCountFrequency (%)
a 3739588
16.1%
i 2845281
12.3%
o 2681344
11.6%
s 1638014
 
7.1%
p 1464517
 
6.3%
l 1407397
 
6.1%
d 1364592
 
5.9%
n 963712
 
4.2%
M 891567
 
3.8%
e 817658
 
3.5%
Other values (49) 5378286
23.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 20969024
90.4%
Uppercase Letter 2222910
 
9.6%
Decimal Number 17
 
< 0.1%
Other Punctuation 3
 
< 0.1%
Dash Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3739588
17.8%
i 2845281
13.6%
o 2681344
12.8%
s 1638014
7.8%
p 1464517
 
7.0%
l 1407397
 
6.7%
d 1364592
 
6.5%
n 963712
 
4.6%
e 817658
 
3.9%
g 676609
 
3.2%
Other values (15) 3370312
16.1%
Uppercase Letter
ValueCountFrequency (%)
M 891567
40.1%
L 289993
 
13.0%
A 289952
 
13.0%
G 157763
 
7.1%
I 149750
 
6.7%
P 138048
 
6.2%
B 98443
 
4.4%
C 59514
 
2.7%
S 49464
 
2.2%
F 30189
 
1.4%
Other values (12) 68227
 
3.1%
Decimal Number
ValueCountFrequency (%)
2 4
23.5%
1 4
23.5%
0 2
11.8%
9 2
11.8%
4 1
 
5.9%
3 1
 
5.9%
5 1
 
5.9%
7 1
 
5.9%
6 1
 
5.9%
Other Punctuation
ValueCountFrequency (%)
: 2
66.7%
. 1
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23191934
> 99.9%
Common 22
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3739588
16.1%
i 2845281
12.3%
o 2681344
11.6%
s 1638014
 
7.1%
p 1464517
 
6.3%
l 1407397
 
6.1%
d 1364592
 
5.9%
n 963712
 
4.2%
M 891567
 
3.8%
e 817658
 
3.5%
Other values (37) 5378264
23.2%
Common
ValueCountFrequency (%)
2 4
18.2%
1 4
18.2%
0 2
9.1%
- 2
9.1%
: 2
9.1%
9 2
9.1%
4 1
 
4.5%
3 1
 
4.5%
5 1
 
4.5%
7 1
 
4.5%
Other values (2) 2
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23191956
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3739588
16.1%
i 2845281
12.3%
o 2681344
11.6%
s 1638014
 
7.1%
p 1464517
 
6.3%
l 1407397
 
6.1%
d 1364592
 
5.9%
n 963712
 
4.2%
M 891567
 
3.8%
e 817658
 
3.5%
Other values (49) 5378286
23.2%

order
Text

Missing 

Distinct926
Distinct (%)< 0.1%
Missing145729
Missing (%)6.2%
Memory size18.0 MiB

Length

Max length22
Median length19
Mean length9.927911798
Min length5

Characters and Unicode

Total characters21997711
Distinct characters56
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique75
Unique (%)< 0.1%

Sample

1st rowDecapoda
2nd rowBrassicales
3rd rowCaudata
4th rowScleractinia
5th rowEunicida
ValueCountFrequency (%)
poales 178531
 
8.1%
asterales 96944
 
4.4%
passeriformes 94751
 
4.3%
rodentia 75757
 
3.4%
lamiales 67866
 
3.1%
fabales 64632
 
2.9%
caudata 60565
 
2.7%
perciformes 54527
 
2.5%
malpighiales 53482
 
2.4%
decapoda 49962
 
2.3%
Other values (916) 1418727
64.0%

Most occurring characters

ValueCountFrequency (%)
a 3200606
14.5%
e 2456217
11.2%
s 1919697
 
8.7%
l 1767329
 
8.0%
o 1673369
 
7.6%
i 1557251
 
7.1%
r 1440556
 
6.5%
t 919616
 
4.2%
p 728104
 
3.3%
n 715516
 
3.3%
Other values (46) 5619450
25.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 19781961
89.9%
Uppercase Letter 2215743
 
10.1%
Decimal Number 7
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3200606
16.2%
e 2456217
12.4%
s 1919697
9.7%
l 1767329
8.9%
o 1673369
8.5%
i 1557251
7.9%
r 1440556
7.3%
t 919616
 
4.6%
p 728104
 
3.7%
n 715516
 
3.6%
Other values (16) 3403700
17.2%
Uppercase Letter
ValueCountFrequency (%)
P 469423
21.2%
C 313394
14.1%
A 244366
11.0%
L 185525
 
8.4%
S 156732
 
7.1%
M 140519
 
6.3%
R 133204
 
6.0%
D 93481
 
4.2%
F 78521
 
3.5%
H 72334
 
3.3%
Other values (14) 328244
14.8%
Decimal Number
ValueCountFrequency (%)
3 2
28.6%
8 1
14.3%
6 1
14.3%
9 1
14.3%
0 1
14.3%
1 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 21997704
> 99.9%
Common 7
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3200606
14.5%
e 2456217
11.2%
s 1919697
 
8.7%
l 1767329
 
8.0%
o 1673369
 
7.6%
i 1557251
 
7.1%
r 1440556
 
6.5%
t 919616
 
4.2%
p 728104
 
3.3%
n 715516
 
3.3%
Other values (40) 5619443
25.5%
Common
ValueCountFrequency (%)
3 2
28.6%
8 1
14.3%
6 1
14.3%
9 1
14.3%
0 1
14.3%
1 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21997711
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3200606
14.5%
e 2456217
11.2%
s 1919697
 
8.7%
l 1767329
 
8.0%
o 1673369
 
7.6%
i 1557251
 
7.1%
r 1440556
 
6.5%
t 919616
 
4.2%
p 728104
 
3.3%
n 715516
 
3.3%
Other values (46) 5619450
25.5%

superfamily
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing2361471
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length16
Median length11.5
Mean length11.5
Min length7
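
The length figures above can be checked by hand for this two-value column. The following is a minimal sketch using only the Python standard library; the function name `length_stats` is hypothetical, and the two strings are the non-missing values listed in this variable's sample. It is not a claim about how the profiler computes these statistics internally, and the figures for larger columns may follow a different computation.

```python
from statistics import mean, median

def length_stats(values):
    """Return max, median, mean and min string length for non-missing values."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }

# The two non-missing superfamily values shown in the report's sample.
superfamily = ["3034046", "Miconia coronata"]
stats = length_stats(superfamily)
print(stats)
```

For these two values the sketch gives the same maximum, median, mean and minimum as the block above.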

Characters and Unicode

Total characters23
Distinct characters13
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)100.0%

Sample

1st row3034046
2nd rowMiconia coronata
ValueCountFrequency (%)
3034046 1
33.3%
miconia 1
33.3%
coronata 1
33.3%

Most occurring characters

ValueCountFrequency (%)
o 3
13.0%
a 3
13.0%
3 2
8.7%
0 2
8.7%
4 2
8.7%
i 2
8.7%
c 2
8.7%
n 2
8.7%
6 1
 
4.3%
M 1
 
4.3%
Other values (3) 3
13.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14
60.9%
Decimal Number 7
30.4%
Uppercase Letter 1
 
4.3%
Space Separator 1
 
4.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 3
21.4%
a 3
21.4%
i 2
14.3%
c 2
14.3%
n 2
14.3%
r 1
 
7.1%
t 1
 
7.1%
Decimal Number
ValueCountFrequency (%)
3 2
28.6%
0 2
28.6%
4 2
28.6%
6 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
M 1
100.0%
Space Separator
ValueCountFrequency (%)
(space) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15
65.2%
Common 8
34.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 3
20.0%
a 3
20.0%
i 2
13.3%
c 2
13.3%
n 2
13.3%
M 1
 
6.7%
r 1
 
6.7%
t 1
 
6.7%
Common
ValueCountFrequency (%)
3 2
25.0%
0 2
25.0%
4 2
25.0%
6 1
12.5%
(space) 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 3
13.0%
a 3
13.0%
3 2
8.7%
0 2
8.7%
4 2
8.7%
i 2
8.7%
c 2
8.7%
n 2
8.7%
6 1
 
4.3%
M 1
 
4.3%
Other values (3) 3
13.0%

family
Text

Missing 

Distinct6622
Distinct (%)0.3%
Missing52497
Missing (%)2.2%
Memory size18.0 MiB

Length

Max length29
Median length21
Mean length10.85806089
Min length6

Characters and Unicode

Total characters25071002
Distinct characters57
Distinct categories7
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique722
Unique (%)< 0.1%

Sample

1st rowHippolytidae
2nd rowBrassicaceae
3rd rowPlethodontidae
4th rowLumbrineridae
5th rowGomphidae
ValueCountFrequency (%)
poaceae 128004
 
5.5%
asteraceae 91253
 
4.0%
fabaceae 60425
 
2.6%
plethodontidae 56509
 
2.4%
cyperaceae 35190
 
1.5%
rubiaceae 30478
 
1.3%
cricetidae 27411
 
1.2%
muridae 23714
 
1.0%
apidae 20894
 
0.9%
melastomataceae 18664
 
0.8%
Other values (6616) 1816438
78.7%

Most occurring characters

ValueCountFrequency (%)
a 4499005
17.9%
e 4442765
17.7%
i 2325126
9.3%
c 1720240
 
6.9%
d 1491888
 
6.0%
o 1265964
 
5.0%
r 1175408
 
4.7%
l 1016477
 
4.1%
t 859959
 
3.4%
n 822238
 
3.3%
Other values (47) 5451932
21.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 22761967
90.8%
Uppercase Letter 2309006
 
9.2%
Connector Punctuation 20
 
< 0.1%
Space Separator 4
 
< 0.1%
Other Punctuation 3
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4499005
19.8%
e 4442765
19.5%
i 2325126
10.2%
c 1720240
 
7.6%
d 1491888
 
6.6%
o 1265964
 
5.6%
r 1175408
 
5.2%
l 1016477
 
4.5%
t 859959
 
3.8%
n 822238
 
3.6%
Other values (16) 3142897
13.8%
Uppercase Letter
ValueCountFrequency (%)
P 464221
20.1%
C 329853
14.3%
A 269139
11.7%
S 164131
 
7.1%
M 163856
 
7.1%
L 109923
 
4.8%
R 92205
 
4.0%
F 87633
 
3.8%
T 85890
 
3.7%
B 75520
 
3.3%
Other values (16) 466635
20.2%
Connector Punctuation
ValueCountFrequency (%)
_ 20
100.0%
Space Separator
ValueCountFrequency (%)
(space) 4
100.0%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 25070973
> 99.9%
Common 29
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4499005
17.9%
e 4442765
17.7%
i 2325126
9.3%
c 1720240
 
6.9%
d 1491888
 
6.0%
o 1265964
 
5.0%
r 1175408
 
4.7%
l 1016477
 
4.1%
t 859959
 
3.4%
n 822238
 
3.3%
Other values (42) 5451903
21.7%
Common
ValueCountFrequency (%)
_ 20
69.0%
(space) 4
 
13.8%
. 3
 
10.3%
( 1
 
3.4%
) 1
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25071002
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4499005
17.9%
e 4442765
17.7%
i 2325126
9.3%
c 1720240
 
6.9%
d 1491888
 
6.0%
o 1265964
 
5.0%
r 1175408
 
4.7%
l 1016477
 
4.1%
t 859959
 
3.4%
n 822238
 
3.3%
Other values (47) 5451932
21.7%

subfamily
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing2361471
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length16
Median length13.5
Mean length13.5
Min length11

Characters and Unicode

Total characters27
Distinct characters14
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
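
As an illustration of the character properties mentioned here, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the `LABELS` mapping are assumptions made for this example; the two input strings are the non-missing values listed in this variable's sample. The profiler's own implementation may differ.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the long names used in the report.
LABELS = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}

def category_counts(values):
    """Count characters per Unicode general category across all values."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

# The two non-missing subfamily values shown in the report's sample.
subfamily = ["Drosera sp.", "Miconia coronata"]
counts = category_counts(subfamily)
total = sum(counts.values())

for code, n in counts.most_common():
    print(f"{LABELS.get(code, code)} {n} {100 * n / total:.1f}%")
```

The 27 characters split into four categories, which lines up with the "Total characters" and "Distinct categories" figures above.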

Unique

Unique2
Unique (%)100.0%

Sample

1st rowDrosera sp.
2nd rowMiconia coronata
ValueCountFrequency (%)
drosera 1
25.0%
sp 1
25.0%
miconia 1
25.0%
coronata 1
25.0%

Most occurring characters

ValueCountFrequency (%)
o 4
14.8%
a 4
14.8%
r 3
11.1%
s 2
7.4%
(space) 2
7.4%
i 2
7.4%
c 2
7.4%
n 2
7.4%
D 1
 
3.7%
e 1
 
3.7%
Other values (4) 4
14.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 22
81.5%
Space Separator 2
 
7.4%
Uppercase Letter 2
 
7.4%
Other Punctuation 1
 
3.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 4
18.2%
a 4
18.2%
r 3
13.6%
s 2
9.1%
i 2
9.1%
c 2
9.1%
n 2
9.1%
e 1
 
4.5%
p 1
 
4.5%
t 1
 
4.5%
Uppercase Letter
ValueCountFrequency (%)
D 1
50.0%
M 1
50.0%
Space Separator
ValueCountFrequency (%)
(space) 2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 24
88.9%
Common 3
 
11.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 4
16.7%
a 4
16.7%
r 3
12.5%
s 2
8.3%
i 2
8.3%
c 2
8.3%
n 2
8.3%
D 1
 
4.2%
e 1
 
4.2%
p 1
 
4.2%
Other values (2) 2
8.3%
Common
ValueCountFrequency (%)
(space) 2
66.7%
. 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 4
14.8%
a 4
14.8%
r 3
11.1%
s 2
7.4%
(space) 2
7.4%
i 2
7.4%
c 2
7.4%
n 2
7.4%
D 1
 
3.7%
e 1
 
3.7%
Other values (4) 4
14.8%

subtribe
Text

Missing 

Distinct2
Distinct (%)66.7%
Missing2361470
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length77
Median length3
Mean length27.66666667
Min length3

Characters and Unicode

Total characters83
Distinct characters21
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)33.3%
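
The distinct, unique and missing figures for this variable are consistent with the following arithmetic: percentages for distinct and unique values are taken over the non-missing cells, while the missing percentage is taken over all observations. The sketch below is a standard-library illustration under that reading; `summarize` is a hypothetical helper, `None` is assumed to mark a missing cell, and the three strings are the values listed in this variable's sample.

```python
from collections import Counter

def summarize(values):
    """Distinct/unique/missing counts and percentages for one column.

    None marks a missing cell (an assumption of this sketch).
    """
    values = list(values)
    present = [v for v in values if v is not None]
    counts = Counter(present)
    n_total, n_present = len(values), len(present)
    n_distinct = len(counts)
    # "Unique" here means values that occur exactly once.
    n_unique = sum(1 for c in counts.values() if c == 1)
    n_missing = n_total - n_present
    return {
        "distinct": n_distinct,
        "distinct_pct": 100 * n_distinct / n_present if n_present else 0.0,
        "unique": n_unique,
        "unique_pct": 100 * n_unique / n_present if n_present else 0.0,
        "missing": n_missing,
        "missing_pct": 100 * n_missing / n_total if n_total else 0.0,
    }

# Three non-missing subtribe values plus the reported number of missing cells.
subtribe = [
    "EML",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84",
    "EML",
] + [None] * 2361470

stats = summarize(subtribe)
print(stats)
```

With 2361473 observations in total, three non-missing cells give a missing share above 99.9%, two distinct values (66.7%) and one value that occurs exactly once (33.3%).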

Sample

1st rowEML
2nd rowOCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84
3rd rowEML
ValueCountFrequency (%)
eml 2
66.7%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84 1
33.3%

Most occurring characters

ValueCountFrequency (%)
E 9
 
10.8%
_ 8
 
9.6%
D 6
 
7.2%
U 6
 
7.2%
I 5
 
6.0%
M 5
 
6.0%
S 5
 
6.0%
T 5
 
6.0%
R 5
 
6.0%
C 5
 
6.0%
Other values (11) 24
28.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 72
86.7%
Connector Punctuation 8
 
9.6%
Decimal Number 2
 
2.4%
Other Punctuation 1
 
1.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 9
12.5%
D 6
 
8.3%
U 6
 
8.3%
I 5
 
6.9%
M 5
 
6.9%
S 5
 
6.9%
T 5
 
6.9%
R 5
 
6.9%
C 5
 
6.9%
N 4
 
5.6%
Other values (7) 17
23.6%
Decimal Number
ValueCountFrequency (%)
8 1
50.0%
4 1
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 8
100.0%
Other Punctuation
ValueCountFrequency (%)
; 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 72
86.7%
Common 11
 
13.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 9
12.5%
D 6
 
8.3%
U 6
 
8.3%
I 5
 
6.9%
M 5
 
6.9%
S 5
 
6.9%
T 5
 
6.9%
R 5
 
6.9%
C 5
 
6.9%
N 4
 
5.6%
Other values (7) 17
23.6%
Common
ValueCountFrequency (%)
_ 8
72.7%
; 1
 
9.1%
8 1
 
9.1%
4 1
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 83
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 9
 
10.8%
_ 8
 
9.6%
D 6
 
7.2%
U 6
 
7.2%
I 5
 
6.0%
M 5
 
6.0%
S 5
 
6.0%
T 5
 
6.0%
R 5
 
6.0%
C 5
 
6.0%
Other values (11) 24
28.9%

genus
Text

Missing 

Distinct58510
Distinct (%)2.6%
Missing120652
Missing (%)5.1%
Memory size18.0 MiB

Length

Max length24
Median length21
Mean length9.034970665
Min length2

Characters and Unicode

Total characters20245752
Distinct characters64
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique16466
Unique (%)0.7%

Sample

1st rowPaysonia
2nd rowDesmognathus
3rd rowNinoe
4th rowHylogomphus
5th rowSkrjabinoclava
ValueCountFrequency (%)
plethodon 42953
 
1.9%
bombus 15824
 
0.7%
carex 14686
 
0.7%
miconia 10093
 
0.5%
peromyscus 10025
 
0.4%
desmognathus 9258
 
0.4%
cladonia 7917
 
0.4%
poa 7658
 
0.3%
cyperus 7007
 
0.3%
paspalum 6575
 
0.3%
Other values (58499) 2108825
94.1%

Most occurring characters

ValueCountFrequency (%)
a 2232664
 
11.0%
i 1675014
 
8.3%
o 1648843
 
8.1%
e 1402186
 
6.9%
s 1326799
 
6.6%
r 1282975
 
6.3%
l 1123780
 
5.6%
u 1022138
 
5.0%
n 994213
 
4.9%
t 952492
 
4.7%
Other values (54) 6584648
32.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18004558
88.9%
Uppercase Letter 2240850
 
11.1%
Dash Punctuation 304
 
< 0.1%
Decimal Number 34
 
< 0.1%
Other Punctuation 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2232664
12.4%
i 1675014
 
9.3%
o 1648843
 
9.2%
e 1402186
 
7.8%
s 1326799
 
7.4%
r 1282975
 
7.1%
l 1123780
 
6.2%
u 1022138
 
5.7%
n 994213
 
5.5%
t 952492
 
5.3%
Other values (16) 4343454
24.1%
Uppercase Letter
ValueCountFrequency (%)
P 337157
15.0%
C 290480
13.0%
S 207209
9.2%
A 204837
9.1%
M 154295
 
6.9%
E 121845
 
5.4%
L 119146
 
5.3%
T 103927
 
4.6%
D 102293
 
4.6%
B 93056
 
4.2%
Other values (16) 506605
22.6%
Decimal Number
ValueCountFrequency (%)
2 9
26.5%
0 7
20.6%
5 5
14.7%
1 4
11.8%
4 3
 
8.8%
3 2
 
5.9%
7 2
 
5.9%
8 1
 
2.9%
9 1
 
2.9%
Other Punctuation
ValueCountFrequency (%)
: 4
66.7%
. 2
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 304
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 20245408
> 99.9%
Common 344
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2232664
 
11.0%
i 1675014
 
8.3%
o 1648843
 
8.1%
e 1402186
 
6.9%
s 1326799
 
6.6%
r 1282975
 
6.3%
l 1123780
 
5.6%
u 1022138
 
5.0%
n 994213
 
4.9%
t 952492
 
4.7%
Other values (42) 6584304
32.5%
Common
ValueCountFrequency (%)
- 304
88.4%
2 9
 
2.6%
0 7
 
2.0%
5 5
 
1.5%
1 4
 
1.2%
: 4
 
1.2%
4 3
 
0.9%
3 2
 
0.6%
7 2
 
0.6%
. 2
 
0.6%
Other values (2) 2
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20245752
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2232664
 
11.0%
i 1675014
 
8.3%
o 1648843
 
8.1%
e 1402186
 
6.9%
s 1326799
 
6.6%
r 1282975
 
6.3%
l 1123780
 
5.6%
u 1022138
 
5.0%
n 994213
 
4.9%
t 952492
 
4.7%
Other values (54) 6584648
32.5%

genericName
Text

Missing 

Distinct60031
Distinct (%)2.7%
Missing120743
Missing (%)5.1%
Memory size18.0 MiB

Length

Max length24
Median length21
Mean length8.952593128
Min length1

Characters and Unicode

Total characters20060344
Distinct characters66
Distinct categories6
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique18598
Unique (%)0.8%

Sample

1st rowLesquerella
2nd rowDesmognathus
3rd rowNinoe
4th rowGomphus
5th rowSkrjabinoclava
ValueCountFrequency (%)
plethodon 42953
 
1.9%
bombus 15821
 
0.7%
carex 14678
 
0.7%
peromyscus 10025
 
0.4%
desmognathus 9258
 
0.4%
poa 7661
 
0.3%
cyperus 6995
 
0.3%
cladonia 6779
 
0.3%
paspalum 6559
 
0.3%
solanum 6347
 
0.3%
Other values (60020) 2113656
94.3%

Most occurring characters

ValueCountFrequency (%)
a 2206605
 
11.0%
i 1657025
 
8.3%
o 1617618
 
8.1%
e 1383725
 
6.9%
s 1315600
 
6.6%
r 1283428
 
6.4%
l 1102712
 
5.5%
u 1023714
 
5.1%
n 983410
 
4.9%
t 942000
 
4.7%
Other values (56) 6544507
32.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 17819535
88.8%
Uppercase Letter 2240735
 
11.2%
Decimal Number 34
 
< 0.1%
Dash Punctuation 30
 
< 0.1%
Other Punctuation 8
 
< 0.1%
Space Separator 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2206605
12.4%
i 1657025
 
9.3%
o 1617618
 
9.1%
e 1383725
 
7.8%
s 1315600
 
7.4%
r 1283428
 
7.2%
l 1102712
 
6.2%
u 1023714
 
5.7%
n 983410
 
5.5%
t 942000
 
5.3%
Other values (18) 4303698
24.2%
Uppercase Letter
ValueCountFrequency (%)
P 334484
14.9%
C 299923
13.4%
A 207393
9.3%
S 200915
 
9.0%
M 150104
 
6.7%
L 120892
 
5.4%
E 116992
 
5.2%
T 108267
 
4.8%
D 102239
 
4.6%
B 94126
 
4.2%
Other values (16) 505400
22.6%
Decimal Number
ValueCountFrequency (%)
2 10
29.4%
1 8
23.5%
4 6
17.6%
0 4
 
11.8%
8 2
 
5.9%
3 2
 
5.9%
6 2
 
5.9%
Other Punctuation
ValueCountFrequency (%)
: 4
50.0%
. 3
37.5%
? 1
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 30
100.0%
Space Separator
ValueCountFrequency (%)
(space) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 20060270
> 99.9%
Common 74
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2206605
 
11.0%
i 1657025
 
8.3%
o 1617618
 
8.1%
e 1383725
 
6.9%
s 1315600
 
6.6%
r 1283428
 
6.4%
l 1102712
 
5.5%
u 1023714
 
5.1%
n 983410
 
4.9%
t 942000
 
4.7%
Other values (44) 6544433
32.6%
Common
ValueCountFrequency (%)
- 30
40.5%
2 10
 
13.5%
1 8
 
10.8%
4 6
 
8.1%
0 4
 
5.4%
: 4
 
5.4%
. 3
 
4.1%
8 2
 
2.7%
3 2
 
2.7%
6 2
 
2.7%
Other values (2) 3
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20060323
> 99.9%
None 21
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2206605
 
11.0%
i 1657025
 
8.3%
o 1617618
 
8.1%
e 1383725
 
6.9%
s 1315600
 
6.6%
r 1283428
 
6.4%
l 1102712
 
5.5%
u 1023714
 
5.1%
n 983410
 
4.9%
t 942000
 
4.7%
Other values (54) 6544486
32.6%
None
ValueCountFrequency (%)
ë 20
95.2%
ö 1
 
4.8%

subgenus
Text

Missing 

Distinct2
Distinct (%)66.7%
Missing2361470
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length5
Median length4
Mean length4.333333333
Min length4

Characters and Unicode

Total characters13
Distinct characters8
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)33.3%

Sample

1st rowtrue
2nd rowfalse
3rd rowtrue
ValueCountFrequency (%)
true 2
66.7%
false 1
33.3%

Most occurring characters

ValueCountFrequency (%)
e 3
23.1%
t 2
15.4%
r 2
15.4%
u 2
15.4%
f 1
 
7.7%
a 1
 
7.7%
l 1
 
7.7%
s 1
 
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3
23.1%
t 2
15.4%
r 2
15.4%
u 2
15.4%
f 1
 
7.7%
a 1
 
7.7%
l 1
 
7.7%
s 1
 
7.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 13
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3
23.1%
t 2
15.4%
r 2
15.4%
u 2
15.4%
f 1
 
7.7%
a 1
 
7.7%
l 1
 
7.7%
s 1
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3
23.1%
t 2
15.4%
r 2
15.4%
u 2
15.4%
f 1
 
7.7%
a 1
 
7.7%
l 1
 
7.7%
s 1
 
7.7%

infragenericEpithet
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing2361471
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length36
Median length21.5
Mean length21.5
Min length7

Characters and Unicode

Total characters43
Distinct characters16
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)100.0%

Sample

1st row5410907
2nd row821cc27a-e3bb-4bc5-ac34-89ada245069d
ValueCountFrequency (%)
5410907 1
50.0%
821cc27a-e3bb-4bc5-ac34-89ada245069d 1
50.0%

Most occurring characters

ValueCountFrequency (%)
4 4
 
9.3%
c 4
 
9.3%
a 4
 
9.3%
- 4
 
9.3%
5 3
 
7.0%
0 3
 
7.0%
9 3
 
7.0%
2 3
 
7.0%
b 3
 
7.0%
1 2
 
4.7%
Other values (6) 10
23.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 25
58.1%
Lowercase Letter 14
32.6%
Dash Punctuation 4
 
9.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 4
16.0%
5 3
12.0%
0 3
12.0%
9 3
12.0%
2 3
12.0%
1 2
8.0%
7 2
8.0%
8 2
8.0%
3 2
8.0%
6 1
 
4.0%
Lowercase Letter
ValueCountFrequency (%)
c 4
28.6%
a 4
28.6%
b 3
21.4%
d 2
14.3%
e 1
 
7.1%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 29
67.4%
Latin 14
32.6%

Most frequent character per script

Common
ValueCountFrequency (%)
4 4
13.8%
- 4
13.8%
5 3
10.3%
0 3
10.3%
9 3
10.3%
2 3
10.3%
1 2
6.9%
7 2
6.9%
8 2
6.9%
3 2
6.9%
Latin
ValueCountFrequency (%)
c 4
28.6%
a 4
28.6%
b 3
21.4%
d 2
14.3%
e 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 43
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 4
 
9.3%
c 4
 
9.3%
a 4
 
9.3%
- 4
 
9.3%
5 3
 
7.0%
0 3
 
7.0%
9 3
 
7.0%
2 3
 
7.0%
b 3
 
7.0%
1 2
 
4.7%
Other values (6) 10
23.3%

specificEpithet
Text

Missing 

Distinct101231
Distinct (%)4.9%
Missing306545
Missing (%)13.0%
Memory size18.0 MiB

Length

Max length25
Median length20
Mean length8.923929208
Min length2

Characters and Unicode

Total characters18338032
Distinct characters41
Distinct categories4
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique40686
Unique (%)2.0%

Sample

1st rowlescurii
2nd rowochrophaeus
3rd rowkinbergi
4th rowadelphus
5th rowcouchii
ValueCountFrequency (%)
cinereus 20993
 
1.0%
americana 5520
 
0.3%
gracilis 5231
 
0.3%
canadensis 4690
 
0.2%
maniculatus 4077
 
0.2%
fuscus 4025
 
0.2%
occidentalis 3909
 
0.2%
montanus 3857
 
0.2%
elegans 3772
 
0.2%
carolinensis 3302
 
0.2%
Other values (101221) 1995552
97.1%

Most occurring characters

ValueCountFrequency (%)
a 2365613
12.9%
i 2078357
11.3%
s 1544823
 
8.4%
e 1334633
 
7.3%
r 1236799
 
6.7%
u 1199119
 
6.5%
n 1159867
 
6.3%
l 1147225
 
6.3%
t 1010250
 
5.5%
o 1001337
 
5.5%
Other values (31) 4260009
23.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18333151
> 99.9%
Dash Punctuation 4870
 
< 0.1%
Decimal Number 9
 
< 0.1%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2365613
12.9%
i 2078357
11.3%
s 1544823
 
8.4%
e 1334633
 
7.3%
r 1236799
 
6.7%
u 1199119
 
6.5%
n 1159867
 
6.3%
l 1147225
 
6.3%
t 1010250
 
5.5%
o 1001337
 
5.5%
Other values (21) 4255128
23.2%
Decimal Number
ValueCountFrequency (%)
1 2
22.2%
0 2
22.2%
3 1
11.1%
5 1
11.1%
4 1
11.1%
9 1
11.1%
7 1
11.1%
Uppercase Letter
ValueCountFrequency (%)
U 1
50.0%
S 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 4870
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18333153
> 99.9%
Common 4879
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2365613
12.9%
i 2078357
11.3%
s 1544823
 
8.4%
e 1334633
 
7.3%
r 1236799
 
6.7%
u 1199119
 
6.5%
n 1159867
 
6.3%
l 1147225
 
6.3%
t 1010250
 
5.5%
o 1001337
 
5.5%
Other values (23) 4255130
23.2%
Common
ValueCountFrequency (%)
- 4870
99.8%
1 2
 
< 0.1%
0 2
 
< 0.1%
3 1
 
< 0.1%
5 1
 
< 0.1%
4 1
 
< 0.1%
9 1
 
< 0.1%
7 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18337869
> 99.9%
None 163
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2365613
12.9%
i 2078357
11.3%
s 1544823
 
8.4%
e 1334633
 
7.3%
r 1236799
 
6.7%
u 1199119
 
6.5%
n 1159867
 
6.3%
l 1147225
 
6.3%
t 1010250
 
5.5%
o 1001337
 
5.5%
Other values (26) 4259846
23.2%
None
ValueCountFrequency (%)
ü 95
58.3%
ö 31
 
19.0%
ï 18
 
11.0%
ë 18
 
11.0%
ä 1
 
0.6%

infraspecificEpithet
Text

Missing 

Distinct16294
Distinct (%)7.3%
Missing2138642
Missing (%)90.6%
Memory size18.0 MiB

Length

Max length25
Median length19
Mean length8.952681629
Min length1

Characters and Unicode

Total characters1994935
Distinct characters40
Distinct categories5
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5417
Unique (%)2.4%

Sample

1st rowcinnamomina
2nd rowberlandieri
3rd rowmellodora
4th rowrubiginosa
5th rowspergulariiforme
ValueCountFrequency (%)
domesticus 1270
 
0.6%
acuminatum 1170
 
0.5%
pennsylvanicus 1114
 
0.5%
cinereus 977
 
0.4%
talpoides 972
 
0.4%
carolinensis 825
 
0.4%
occidentalis 737
 
0.3%
mexicana 726
 
0.3%
major 669
 
0.3%
borealis 646
 
0.3%
Other values (16284) 213725
95.9%

Most occurring characters

ValueCountFrequency (%)
a 236092
11.8%
i 231855
11.6%
s 193161
9.7%
e 150019
 
7.5%
n 135896
 
6.8%
r 129914
 
6.5%
u 129816
 
6.5%
l 121955
 
6.1%
o 110496
 
5.5%
c 101725
 
5.1%
Other values (30) 454006
22.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1994754
> 99.9%
Dash Punctuation 158
 
< 0.1%
Decimal Number 18
 
< 0.1%
Other Punctuation 3
 
< 0.1%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 236092
11.8%
i 231855
11.6%
s 193161
9.7%
e 150019
 
7.5%
n 135896
 
6.8%
r 129914
 
6.5%
u 129816
 
6.5%
l 121955
 
6.1%
o 110496
 
5.5%
c 101725
 
5.1%
Other values (17) 453825
22.8%
Decimal Number
ValueCountFrequency (%)
2 4
22.2%
0 4
22.2%
1 4
22.2%
6 2
11.1%
4 1
 
5.6%
3 1
 
5.6%
5 1
 
5.6%
9 1
 
5.6%
Other Punctuation
ValueCountFrequency (%)
: 2
66.7%
. 1
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 1
50.0%
Z 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 158
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1994756
> 99.9%
Common 179
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 236092
11.8%
i 231855
11.6%
s 193161
9.7%
e 150019
 
7.5%
n 135896
 
6.8%
r 129914
 
6.5%
u 129816
 
6.5%
l 121955
 
6.1%
o 110496
 
5.5%
c 101725
 
5.1%
Other values (19) 453827
22.8%
Common
ValueCountFrequency (%)
- 158
88.3%
2 4
 
2.2%
0 4
 
2.2%
1 4
 
2.2%
: 2
 
1.1%
6 2
 
1.1%
4 1
 
0.6%
3 1
 
0.6%
5 1
 
0.6%
9 1
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1994927
> 99.9%
None 8
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 236092
11.8%
i 231855
11.6%
s 193161
9.7%
e 150019
 
7.5%
n 135896
 
6.8%
r 129914
 
6.5%
u 129816
 
6.5%
l 121955
 
6.1%
o 110496
 
5.5%
c 101725
 
5.1%
Other values (29) 453998
22.8%
None
ValueCountFrequency (%)
ö 8
100.0%

cultivarEpithet
Text

Missing 

Distinct3
Distinct (%)100.0%
Missing2361470
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length13
Median length7
Mean length9
Min length7

Characters and Unicode

Total characters27
Distinct characters15
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3
Unique (%)100.0%

Sample

1st rowOCEANIA
2nd row7707728
3rd rowLATIN_AMERICA
ValueCountFrequency (%)
oceania 1
33.3%
7707728 1
33.3%
latin_america 1
33.3%

Most occurring characters

ValueCountFrequency (%)
A 5
18.5%
7 4
14.8%
I 3
11.1%
C 2
 
7.4%
E 2
 
7.4%
N 2
 
7.4%
O 1
 
3.7%
0 1
 
3.7%
2 1
 
3.7%
8 1
 
3.7%
Other values (5) 5
18.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 19
70.4%
Decimal Number 7
 
25.9%
Connector Punctuation 1
 
3.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 5
26.3%
I 3
15.8%
C 2
 
10.5%
E 2
 
10.5%
N 2
 
10.5%
O 1
 
5.3%
L 1
 
5.3%
T 1
 
5.3%
M 1
 
5.3%
R 1
 
5.3%
Decimal Number
ValueCountFrequency (%)
7 4
57.1%
0 1
 
14.3%
2 1
 
14.3%
8 1
 
14.3%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19
70.4%
Common 8
29.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 5
26.3%
I 3
15.8%
C 2
 
10.5%
E 2
 
10.5%
N 2
 
10.5%
O 1
 
5.3%
L 1
 
5.3%
T 1
 
5.3%
M 1
 
5.3%
R 1
 
5.3%
Common
ValueCountFrequency (%)
7 4
50.0%
0 1
 
12.5%
2 1
 
12.5%
8 1
 
12.5%
_ 1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 5
18.5%
7 4
14.8%
I 3
11.1%
C 2
 
7.4%
E 2
 
7.4%
N 2
 
7.4%
O 1
 
3.7%
0 1
 
3.7%
2 1
 
3.7%
8 1
 
3.7%
Other values (5) 5
18.5%

taxonRank
Text

Distinct13
Distinct (%)< 0.1%
Missing10
Missing (%)< 0.1%
Memory size18.0 MiB

Length

Max length13
Median length7
Mean length6.997910194
Min length3

Characters and Unicode

Total characters16525306
Distinct characters24
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)< 0.1%

Sample

1st rowFAMILY
2nd rowSPECIES
3rd rowSPECIES
4th rowORDER
5th rowSPECIES
ValueCountFrequency (%)
species 1832195
77.6%
genus 185798
 
7.9%
subspecies 170097
 
7.2%
family 70045
 
3.0%
variety 50926
 
2.2%
phylum 17766
 
0.8%
class 16815
 
0.7%
order 8393
 
0.4%
kingdom 7620
 
0.3%
form 1804
 
0.1%
Other values (3) 4
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
S 4394109
26.6%
E 4249704
25.7%
I 2130885
12.9%
P 2020058
12.2%
C 2019109
12.2%
U 373662
 
2.3%
N 193422
 
1.2%
G 193418
 
1.2%
B 170097
 
1.0%
Y 138737
 
0.8%
Other values (14) 642105
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 16525301
> 99.9%
Decimal Number 3
 
< 0.1%
Connector Punctuation 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 4394109
26.6%
E 4249704
25.7%
I 2130885
12.9%
P 2020058
12.2%
C 2019109
12.2%
U 373662
 
2.3%
N 193422
 
1.2%
G 193418
 
1.2%
B 170097
 
1.0%
Y 138737
 
0.8%
Other values (11) 642100
 
3.9%
Decimal Number
ValueCountFrequency (%)
2 2
66.7%
0 1
33.3%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16525301
> 99.9%
Common 5
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 4394109
26.6%
E 4249704
25.7%
I 2130885
12.9%
P 2020058
12.2%
C 2019109
12.2%
U 373662
 
2.3%
N 193422
 
1.2%
G 193418
 
1.2%
B 170097
 
1.0%
Y 138737
 
0.8%
Other values (11) 642100
 
3.9%
Common
ValueCountFrequency (%)
_ 2
40.0%
2 2
40.0%
0 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16525306
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 4394109
26.6%
E 4249704
25.7%
I 2130885
12.9%
P 2020058
12.2%
C 2019109
12.2%
U 373662
 
2.3%
N 193422
 
1.2%
G 193418
 
1.2%
B 170097
 
1.0%
Y 138737
 
0.8%
Other values (14) 642105
 
3.9%

verbatimTaxonRank
Text

Missing 

Distinct 3
Distinct (%) 100.0%
Missing 2361470
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 51
Median length 3
Mean length 19
Min length 3

Characters and Unicode

Total characters 57
Distinct characters 26
Distinct categories 7
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
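
As a minimal sketch of how such per-category counts can be derived, the snippet below classifies every character of the three non-missing verbatimTaxonRank values (the ones listed in this variable's Sample section) by Unicode general category. It uses Python's standard `unicodedata` module, which exposes the general category but not the script or block, so only the category breakdown is reproduced. The helper name `category_counts` is an assumption made for illustration and is not part of the profiling tool.

```python
import unicodedata
from collections import Counter

# The three non-missing verbatimTaxonRank values listed under Sample.
values = ["AUS", "414", "Plantae, Dicotyledonae (basal), Laurales, Lauraceae"]


def category_counts(strings):
    """Count characters by Unicode general category code (e.g. 'Lu', 'Ll', 'Nd')."""
    return Counter(unicodedata.category(ch) for s in strings for ch in s)


counts = category_counts(values)
for code, n in counts.most_common():
    # 'Ll' = Lowercase Letter, 'Lu' = Uppercase Letter, 'Zs' = Space Separator,
    # 'Po' = Other Punctuation, 'Nd' = Decimal Number,
    # 'Ps' = Open Punctuation, 'Pe' = Close Punctuation
    print(code, n)
```

The seven category codes printed here correspond to the seven categories reported for this variable.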

Unique

Unique 3
Unique (%) 100.0%

Sample

1st row AUS
2nd row 414
3rd row Plantae, Dicotyledonae (basal), Laurales, Lauraceae
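
Because only these three values are present, the Length figures and the total character count reported for verbatimTaxonRank can be checked directly. The sketch below recomputes them with the Python standard library; it illustrates the arithmetic and is not the profiling tool's own code.

```python
from statistics import mean, median

# The three non-missing verbatimTaxonRank values shown in the Sample rows.
values = ["AUS", "414", "Plantae, Dicotyledonae (basal), Laurales, Lauraceae"]

lengths = [len(v) for v in values]

summary = {
    "max_length": max(lengths),
    "median_length": median(lengths),
    "mean_length": mean(lengths),
    "min_length": min(lengths),
    "total_characters": sum(lengths),
}

for name, value in summary.items():
    print(name, value)
```

The mean of 19 sits far above the median of 3 because a single 51-character value dominates the total of 57 characters.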
Value Count Frequency (%)
aus 1 14.3%
414 1 14.3%
plantae 1 14.3%
dicotyledonae 1 14.3%
basal 1 14.3%
laurales 1 14.3%
lauraceae 1 14.3%

Most occurring characters

ValueCountFrequency (%)
a 10
17.5%
e 6
 
10.5%
4
 
7.0%
l 4
 
7.0%
, 3
 
5.3%
r 2
 
3.5%
o 2
 
3.5%
c 2
 
3.5%
s 2
 
3.5%
t 2
 
3.5%
Other values (16) 20
35.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 38
66.7%
Uppercase Letter 7
 
12.3%
Space Separator 4
 
7.0%
Other Punctuation 3
 
5.3%
Decimal Number 3
 
5.3%
Close Punctuation 1
 
1.8%
Open Punctuation 1
 
1.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 10
26.3%
e 6
15.8%
l 4
 
10.5%
r 2
 
5.3%
o 2
 
5.3%
c 2
 
5.3%
s 2
 
5.3%
t 2
 
5.3%
n 2
 
5.3%
u 2
 
5.3%
Other values (4) 4
 
10.5%
Uppercase Letter
ValueCountFrequency (%)
L 2
28.6%
A 1
14.3%
U 1
14.3%
P 1
14.3%
S 1
14.3%
D 1
14.3%
Decimal Number
ValueCountFrequency (%)
4 2
66.7%
1 1
33.3%
Space Separator
ValueCountFrequency (%)
4
100.0%
Other Punctuation
ValueCountFrequency (%)
, 3
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 45
78.9%
Common 12
 
21.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 10
22.2%
e 6
13.3%
l 4
 
8.9%
r 2
 
4.4%
o 2
 
4.4%
c 2
 
4.4%
s 2
 
4.4%
t 2
 
4.4%
n 2
 
4.4%
L 2
 
4.4%
Other values (10) 11
24.4%
Common
ValueCountFrequency (%)
4
33.3%
, 3
25.0%
4 2
16.7%
) 1
 
8.3%
( 1
 
8.3%
1 1
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 57
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 10
17.5%
e 6
 
10.5%
4
 
7.0%
l 4
 
7.0%
, 3
 
5.3%
r 2
 
3.5%
o 2
 
3.5%
c 2
 
3.5%
s 2
 
3.5%
t 2
 
3.5%
Other values (16) 20
35.1%

vernacularName
Text

Missing 

Distinct 4
Distinct (%) 100.0%
Missing 2361469
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 9
Median length 7.5
Mean length 7
Min length 4

Characters and Unicode

Total characters 28
Distinct characters 20
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4
Unique (%) 100.0%

Sample

1st row Australia
2nd row 8801
3rd row HOLOTYPE
4th row Plantae
Value Count Frequency (%)
australia 1 25.0%
8801 1 25.0%
holotype 1 25.0%
plantae 1 25.0%

Most occurring characters

ValueCountFrequency (%)
a 4
 
14.3%
O 2
 
7.1%
t 2
 
7.1%
l 2
 
7.1%
P 2
 
7.1%
8 2
 
7.1%
A 1
 
3.6%
n 1
 
3.6%
E 1
 
3.6%
Y 1
 
3.6%
Other values (10) 10
35.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14
50.0%
Uppercase Letter 10
35.7%
Decimal Number 4
 
14.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
28.6%
t 2
14.3%
l 2
14.3%
n 1
 
7.1%
u 1
 
7.1%
i 1
 
7.1%
r 1
 
7.1%
s 1
 
7.1%
e 1
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
O 2
20.0%
P 2
20.0%
A 1
10.0%
E 1
10.0%
Y 1
10.0%
T 1
10.0%
L 1
10.0%
H 1
10.0%
Decimal Number
ValueCountFrequency (%)
8 2
50.0%
1 1
25.0%
0 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 24
85.7%
Common 4
 
14.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
16.7%
O 2
 
8.3%
t 2
 
8.3%
l 2
 
8.3%
P 2
 
8.3%
A 1
 
4.2%
n 1
 
4.2%
E 1
 
4.2%
Y 1
 
4.2%
T 1
 
4.2%
Other values (7) 7
29.2%
Common
ValueCountFrequency (%)
8 2
50.0%
1 1
25.0%
0 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 28
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4
 
14.3%
O 2
 
7.1%
t 2
 
7.1%
l 2
 
7.1%
P 2
 
7.1%
8 2
 
7.1%
A 1
 
3.6%
n 1
 
3.6%
E 1
 
3.6%
Y 1
 
3.6%
Other values (10) 10
35.7%

nomenclaturalCode
Text

Missing 

Distinct 5
Distinct (%) 100.0%
Missing 2361468
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 16
Median length 15
Mean length 11.4
Min length 7

Characters and Unicode

Total characters 57
Distinct characters 32
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 5
Unique (%) 100.0%

Sample

1st row AUS.6_1
2nd row Howell, Tiffany
3rd row 3161935
4th row Maccallum, G. A.
5th row Tracheophyta
Value Count Frequency (%)
aus.6_1 1 12.5%
howell 1 12.5%
tiffany 1 12.5%
3161935 1 12.5%
maccallum 1 12.5%
g 1 12.5%
a 1 12.5%
tracheophyta 1 12.5%
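
The word counts above are consistent with splitting each value on whitespace, trimming punctuation from the edges of each token, and lowercasing the result. That is an inference from the table, not a description of how the profiling tool is implemented. The sketch below applies that assumed rule to the five nomenclaturalCode sample values; `tokenize` is a hypothetical helper introduced only for this illustration.

```python
import string
from collections import Counter

# The five non-missing nomenclaturalCode values listed under Sample.
values = ["AUS.6_1", "Howell, Tiffany", "3161935", "Maccallum, G. A.", "Tracheophyta"]


def tokenize(value):
    """Hypothetical rule: split on whitespace, strip edge punctuation, lowercase."""
    tokens = (part.strip(string.punctuation).lower() for part in value.split())
    # Drop tokens that consisted only of punctuation.
    return [t for t in tokens if t]


word_counts = Counter(t for v in values for t in tokenize(v))
total = sum(word_counts.values())

for word, n in word_counts.items():
    print(f"{word} {n} {100 * n / total:.1f}%")
```

Under this rule, punctuation inside a token (as in `aus.6_1`) is kept, while the trailing periods of the initials "G." and "A." are dropped, which matches the single-letter entries in the table.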

Most occurring characters

ValueCountFrequency (%)
a 5
 
8.8%
l 4
 
7.0%
. 3
 
5.3%
c 3
 
5.3%
1 3
 
5.3%
3
 
5.3%
A 2
 
3.5%
, 2
 
3.5%
h 2
 
3.5%
3 2
 
3.5%
Other values (22) 28
49.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 30
52.6%
Decimal Number 9
 
15.8%
Uppercase Letter 9
 
15.8%
Other Punctuation 5
 
8.8%
Space Separator 3
 
5.3%
Connector Punctuation 1
 
1.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5
16.7%
l 4
13.3%
c 3
10.0%
h 2
 
6.7%
y 2
 
6.7%
f 2
 
6.7%
e 2
 
6.7%
o 2
 
6.7%
p 1
 
3.3%
r 1
 
3.3%
Other values (6) 6
20.0%
Uppercase Letter
ValueCountFrequency (%)
A 2
22.2%
T 2
22.2%
M 1
11.1%
S 1
11.1%
G 1
11.1%
H 1
11.1%
U 1
11.1%
Decimal Number
ValueCountFrequency (%)
1 3
33.3%
3 2
22.2%
6 2
22.2%
5 1
 
11.1%
9 1
 
11.1%
Other Punctuation
ValueCountFrequency (%)
. 3
60.0%
, 2
40.0%
Space Separator
ValueCountFrequency (%)
3
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 39
68.4%
Common 18
31.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5
 
12.8%
l 4
 
10.3%
c 3
 
7.7%
A 2
 
5.1%
h 2
 
5.1%
y 2
 
5.1%
T 2
 
5.1%
f 2
 
5.1%
e 2
 
5.1%
o 2
 
5.1%
Other values (13) 13
33.3%
Common
ValueCountFrequency (%)
. 3
16.7%
1 3
16.7%
3
16.7%
, 2
11.1%
3 2
11.1%
6 2
11.1%
5 1
 
5.6%
9 1
 
5.6%
_ 1
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 57
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 5
 
8.8%
l 4
 
7.0%
. 3
 
5.3%
c 3
 
5.3%
1 3
 
5.3%
3
 
5.3%
A 2
 
3.5%
, 2
 
3.5%
h 2
 
3.5%
3 2
 
3.5%
Other values (22) 28
49.1%
Distinct 6
Distinct (%) < 0.1%
Missing 5772
Missing (%) 0.2%
Memory size 18.0 MiB

Length

Max length 77
Median length 8
Mean length 7.830022146
Min length 7

Characters and Unicode

Total characters 18445191
Distinct characters 39
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) < 0.1%

Sample

1st row ACCEPTED
2nd row SYNONYM
3rd row ACCEPTED
4th row ACCEPTED
5th row ACCEPTED
Value Count Frequency (%)
accepted 1936163 82.2%
synonym 400501 17.0%
doubtful 19034 0.8%
northern 1 < 0.1%
territory 1 < 0.1%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84 1 < 0.1%
magnoliopsida 1 < 0.1%

Most occurring characters

ValueCountFrequency (%)
E 3872333
21.0%
C 3872331
21.0%
T 1955203
10.6%
D 1955203
10.6%
A 1936167
10.5%
P 1936163
10.5%
N 801007
 
4.3%
Y 801002
 
4.3%
O 419539
 
2.3%
S 400506
 
2.2%
Other values (29) 495737
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 18445152
> 99.9%
Lowercase Letter 27
 
< 0.1%
Connector Punctuation 8
 
< 0.1%
Decimal Number 2
 
< 0.1%
Space Separator 1
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 3872333
21.0%
C 3872331
21.0%
T 1955203
10.6%
D 1955203
10.6%
A 1936167
10.5%
P 1936163
10.5%
N 801007
 
4.3%
Y 801002
 
4.3%
O 419539
 
2.3%
S 400506
 
2.2%
Other values (10) 495698
 
2.7%
Lowercase Letter
ValueCountFrequency (%)
r 5
18.5%
o 4
14.8%
i 3
11.1%
a 2
 
7.4%
e 2
 
7.4%
n 2
 
7.4%
t 2
 
7.4%
y 1
 
3.7%
h 1
 
3.7%
g 1
 
3.7%
Other values (4) 4
14.8%
Decimal Number
ValueCountFrequency (%)
8 1
50.0%
4 1
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 8
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%
Other Punctuation
ValueCountFrequency (%)
; 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18445179
> 99.9%
Common 12
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 3872333
21.0%
C 3872331
21.0%
T 1955203
10.6%
D 1955203
10.6%
A 1936167
10.5%
P 1936163
10.5%
N 801007
 
4.3%
Y 801002
 
4.3%
O 419539
 
2.3%
S 400506
 
2.2%
Other values (24) 495725
 
2.7%
Common
ValueCountFrequency (%)
_ 8
66.7%
1
 
8.3%
; 1
 
8.3%
8 1
 
8.3%
4 1
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18445191
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 3872333
21.0%
C 3872331
21.0%
T 1955203
10.6%
D 1955203
10.6%
A 1936167
10.5%
P 1936163
10.5%
N 801007
 
4.3%
Y 801002
 
4.3%
O 419539
 
2.3%
S 400506
 
2.2%
Other values (29) 495737
 
2.7%

nomenclaturalStatus
Text

Missing 

Distinct 4
Distinct (%) 100.0%
Missing 2361469
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 10
Median length 9
Mean length 8.75
Min length 7

Characters and Unicode

Total characters 35
Distinct characters 25
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4
Unique (%) 100.0%

Sample

1st row AUS.6.12_1
2nd row 5410907
3rd row StillImage
4th row Laurales
Value Count Frequency (%)
aus.6.12_1 1 25.0%
5410907 1 25.0%
stillimage 1 25.0%
laurales 1 25.0%

Most occurring characters

ValueCountFrequency (%)
l 3
 
8.6%
1 3
 
8.6%
a 3
 
8.6%
S 2
 
5.7%
. 2
 
5.7%
e 2
 
5.7%
0 2
 
5.7%
A 1
 
2.9%
r 1
 
2.9%
u 1
 
2.9%
Other values (15) 15
42.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15
42.9%
Decimal Number 11
31.4%
Uppercase Letter 6
 
17.1%
Other Punctuation 2
 
5.7%
Connector Punctuation 1
 
2.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 3
20.0%
a 3
20.0%
e 2
13.3%
r 1
 
6.7%
u 1
 
6.7%
g 1
 
6.7%
m 1
 
6.7%
i 1
 
6.7%
t 1
 
6.7%
s 1
 
6.7%
Decimal Number
ValueCountFrequency (%)
1 3
27.3%
0 2
18.2%
7 1
 
9.1%
9 1
 
9.1%
4 1
 
9.1%
5 1
 
9.1%
2 1
 
9.1%
6 1
 
9.1%
Uppercase Letter
ValueCountFrequency (%)
S 2
33.3%
A 1
16.7%
L 1
16.7%
I 1
16.7%
U 1
16.7%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 21
60.0%
Common 14
40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 3
14.3%
a 3
14.3%
S 2
 
9.5%
e 2
 
9.5%
A 1
 
4.8%
r 1
 
4.8%
u 1
 
4.8%
L 1
 
4.8%
g 1
 
4.8%
m 1
 
4.8%
Other values (5) 5
23.8%
Common
ValueCountFrequency (%)
1 3
21.4%
. 2
14.3%
0 2
14.3%
7 1
 
7.1%
9 1
 
7.1%
4 1
 
7.1%
5 1
 
7.1%
_ 1
 
7.1%
2 1
 
7.1%
6 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 3
 
8.6%
1 3
 
8.6%
a 3
 
8.6%
S 2
 
5.7%
. 2
 
5.7%
e 2
 
5.7%
0 2
 
5.7%
A 1
 
2.9%
r 1
 
2.9%
u 1
 
2.9%
Other values (15) 15
42.9%

taxonRemarks
Text

Missing 

Distinct 3
Distinct (%) 100.0%
Missing 2361470
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 22
Median length 10
Mean length 12
Min length 4

Characters and Unicode

Total characters 36
Distinct characters 17
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) 100.0%

Sample

1st row Roper Gulf
2nd row Campanula rotundifolia
3rd row true
Value Count Frequency (%)
roper 1 20.0%
gulf 1 20.0%
campanula 1 20.0%
rotundifolia 1 20.0%
true 1 20.0%

Most occurring characters

ValueCountFrequency (%)
a 4
11.1%
u 4
11.1%
l 3
 
8.3%
o 3
 
8.3%
r 3
 
8.3%
t 2
 
5.6%
n 2
 
5.6%
f 2
 
5.6%
i 2
 
5.6%
2
 
5.6%
Other values (7) 9
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 31
86.1%
Uppercase Letter 3
 
8.3%
Space Separator 2
 
5.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
12.9%
u 4
12.9%
l 3
9.7%
o 3
9.7%
r 3
9.7%
t 2
6.5%
n 2
6.5%
f 2
6.5%
i 2
6.5%
e 2
6.5%
Other values (3) 4
12.9%
Uppercase Letter
ValueCountFrequency (%)
G 1
33.3%
C 1
33.3%
R 1
33.3%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 34
94.4%
Common 2
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
11.8%
u 4
11.8%
l 3
8.8%
o 3
8.8%
r 3
8.8%
t 2
 
5.9%
n 2
 
5.9%
f 2
 
5.9%
i 2
 
5.9%
e 2
 
5.9%
Other values (6) 7
20.6%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4
11.1%
u 4
11.1%
l 3
 
8.3%
o 3
 
8.3%
r 3
 
8.3%
t 2
 
5.6%
n 2
 
5.6%
f 2
 
5.6%
i 2
 
5.6%
2
 
5.6%
Other values (7) 9
25.0%
Distinct 4
Distinct (%) < 0.1%
Missing 10
Missing (%) < 0.1%
Memory size 18.0 MiB

Length

Max length 36
Median length 36
Mean length 35.99997078
Min length 5

Characters and Unicode

Total characters 85012599
Distinct characters 31
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) < 0.1%

Sample

1st row 821cc27a-e3bb-4bc5-ac34-89ada245069d
2nd row 821cc27a-e3bb-4bc5-ac34-89ada245069d
3rd row 821cc27a-e3bb-4bc5-ac34-89ada245069d
4th row 821cc27a-e3bb-4bc5-ac34-89ada245069d
5th row 821cc27a-e3bb-4bc5-ac34-89ada245069d
Value Count Frequency (%)
821cc27a-e3bb-4bc5-ac34-89ada245069d 2361460 > 99.9%
campanula 1 < 0.1%
rotundifolia 1 < 0.1%
l 1 < 0.1%
false 1 < 0.1%
lauraceae 1 < 0.1%

Most occurring characters

ValueCountFrequency (%)
a 9445848
11.1%
c 9445841
11.1%
- 9445840
11.1%
2 7084380
8.3%
b 7084380
8.3%
4 7084380
8.3%
d 4722921
 
5.6%
8 4722920
 
5.6%
3 4722920
 
5.6%
5 4722920
 
5.6%
Other values (21) 16530249
19.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 42506280
50.0%
Lowercase Letter 33060473
38.9%
Dash Punctuation 9445840
 
11.1%
Uppercase Letter 3
 
< 0.1%
Space Separator 2
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 9445848
28.6%
c 9445841
28.6%
b 7084380
21.4%
d 4722921
14.3%
e 2361463
 
7.1%
u 3
 
< 0.1%
l 3
 
< 0.1%
r 2
 
< 0.1%
f 2
 
< 0.1%
i 2
 
< 0.1%
Other values (6) 8
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 7084380
16.7%
4 7084380
16.7%
8 4722920
11.1%
3 4722920
11.1%
5 4722920
11.1%
9 4722920
11.1%
0 2361460
 
5.6%
6 2361460
 
5.6%
7 2361460
 
5.6%
1 2361460
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
L 2
66.7%
C 1
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 9445840
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 51952123
61.1%
Latin 33060476
38.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 9445848
28.6%
c 9445841
28.6%
b 7084380
21.4%
d 4722921
14.3%
e 2361463
 
7.1%
u 3
 
< 0.1%
l 3
 
< 0.1%
r 2
 
< 0.1%
L 2
 
< 0.1%
f 2
 
< 0.1%
Other values (8) 11
 
< 0.1%
Common
ValueCountFrequency (%)
- 9445840
18.2%
2 7084380
13.6%
4 7084380
13.6%
8 4722920
9.1%
3 4722920
9.1%
5 4722920
9.1%
9 4722920
9.1%
0 2361460
 
4.5%
6 2361460
 
4.5%
7 2361460
 
4.5%
Other values (3) 2361463
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 85012599
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 9445848
11.1%
c 9445841
11.1%
- 9445840
11.1%
2 7084380
8.3%
b 7084380
8.3%
4 7084380
8.3%
d 4722921
 
5.6%
8 4722920
 
5.6%
3 4722920
 
5.6%
5 4722920
 
5.6%
Other values (21) 16530249
19.4%
Distinct 3
Distinct (%) < 0.1%
Missing 11
Missing (%) < 0.1%
Memory size 18.0 MiB

Length

Max length 22
Median length 2
Mean length 2.000010587
Min length 2

Characters and Unicode

Total characters 4722949
Distinct characters 21
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) < 0.1%

Sample

1st row US
2nd row US
3rd row US
4th row US
5th row US
Value Count Frequency (%)
us 2361460 > 99.9%
campanula 1 < 0.1%
rotundifolia 1 < 0.1%
3155772 1 < 0.1%

Most occurring characters

ValueCountFrequency (%)
U 2361460
50.0%
S 2361460
50.0%
a 4
 
< 0.1%
i 2
 
< 0.1%
7 2
 
< 0.1%
5 2
 
< 0.1%
n 2
 
< 0.1%
u 2
 
< 0.1%
l 2
 
< 0.1%
o 2
 
< 0.1%
Other values (11) 11
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4722921
> 99.9%
Lowercase Letter 20
 
< 0.1%
Decimal Number 7
 
< 0.1%
Space Separator 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
20.0%
i 2
10.0%
n 2
10.0%
u 2
10.0%
l 2
10.0%
o 2
10.0%
f 1
 
5.0%
r 1
 
5.0%
d 1
 
5.0%
t 1
 
5.0%
Other values (2) 2
10.0%
Decimal Number
ValueCountFrequency (%)
7 2
28.6%
5 2
28.6%
1 1
14.3%
3 1
14.3%
2 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
U 2361460
50.0%
S 2361460
50.0%
C 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4722941
> 99.9%
Common 8
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 2361460
50.0%
S 2361460
50.0%
a 4
 
< 0.1%
i 2
 
< 0.1%
n 2
 
< 0.1%
u 2
 
< 0.1%
l 2
 
< 0.1%
o 2
 
< 0.1%
f 1
 
< 0.1%
r 1
 
< 0.1%
Other values (5) 5
 
< 0.1%
Common
ValueCountFrequency (%)
7 2
25.0%
5 2
25.0%
1 1
12.5%
3 1
12.5%
1
12.5%
2 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4722949
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 2361460
50.0%
S 2361460
50.0%
a 4
 
< 0.1%
i 2
 
< 0.1%
7 2
 
< 0.1%
5 2
 
< 0.1%
n 2
 
< 0.1%
u 2
 
< 0.1%
l 2
 
< 0.1%
o 2
 
< 0.1%
Other values (11) 11
 
< 0.1%
Distinct 210763
Distinct (%) 8.9%
Missing 10
Missing (%) < 0.1%
Memory size 18.0 MiB

Length

Max length 24
Median length 24
Mean length 23.99606473
Min length 2

Characters and Unicode

Total characters 56665819
Distinct characters 19
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7659
Unique (%) 0.3%

Sample

1st row 2024-12-02T13:59:36.683Z
2nd row 2024-12-02T13:59:14.817Z
3rd row 2024-12-02T13:57:42.802Z
4th row 2024-12-02T13:59:13.837Z
5th row 2024-12-02T13:57:45.358Z
Value Count Frequency (%)
2024-12-02t13:57:25.039z 46 < 0.1%
2024-12-02t13:57:24.083z 45 < 0.1%
2024-12-02t13:57:28.833z 45 < 0.1%
2024-12-02t13:57:45.003z 45 < 0.1%
2024-12-02t13:57:52.915z 44 < 0.1%
2024-12-02t13:57:34.491z 44 < 0.1%
2024-12-02t13:57:52.924z 43 < 0.1%
2024-12-02t13:57:43.166z 43 < 0.1%
2024-12-02t13:57:52.893z 42 < 0.1%
2024-12-02t13:57:42.743z 42 < 0.1%
Other values (210753) 2361024 > 99.9%

Most occurring characters

ValueCountFrequency (%)
2 10789974
19.0%
0 5989862
10.6%
1 5962966
10.5%
- 4722920
8.3%
: 4722920
8.3%
4 3794952
 
6.7%
5 3740549
 
6.6%
3 3738232
 
6.6%
Z 2361460
 
4.2%
T 2361460
 
4.2%
Other values (9) 8480524
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 40137903
70.8%
Other Punctuation 7082072
 
12.5%
Uppercase Letter 4722924
 
8.3%
Dash Punctuation 4722920
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 10789974
26.9%
0 5989862
14.9%
1 5962966
14.9%
4 3794952
 
9.5%
5 3740549
 
9.3%
3 3738232
 
9.3%
7 1827734
 
4.6%
9 1516897
 
3.8%
6 1417013
 
3.5%
8 1359724
 
3.4%
Uppercase Letter
ValueCountFrequency (%)
Z 2361460
50.0%
T 2361460
50.0%
N 1
 
< 0.1%
E 1
 
< 0.1%
L 1
 
< 0.1%
C 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
: 4722920
66.7%
. 2359152
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 4722920
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 51942895
91.7%
Latin 4722924
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 10789974
20.8%
0 5989862
11.5%
1 5962966
11.5%
- 4722920
9.1%
: 4722920
9.1%
4 3794952
 
7.3%
5 3740549
 
7.2%
3 3738232
 
7.2%
. 2359152
 
4.5%
7 1827734
 
3.5%
Other values (3) 4293634
 
8.3%
Latin
ValueCountFrequency (%)
Z 2361460
50.0%
T 2361460
50.0%
N 1
 
< 0.1%
E 1
 
< 0.1%
L 1
 
< 0.1%
C 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56665819
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 10789974
19.0%
0 5989862
10.6%
1 5962966
10.5%
- 4722920
8.3%
: 4722920
8.3%
4 3794952
 
6.7%
5 3740549
 
6.6%
3 3738232
 
6.6%
Z 2361460
 
4.2%
T 2361460
 
4.2%
Other values (9) 8480524
15.0%

elevation
Text

Missing 

Distinct 5275
Distinct (%) 1.0%
Missing 1813940
Missing (%) 76.8%
Memory size 18.0 MiB

Length

Max length 18
Median length 17
Mean length 5.344459603
Min length 1

Characters and Unicode

Total characters 2926268
Distinct characters 15
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 992
Unique (%) 0.2%

Sample

1st row 1097.5
2nd row 140.0
3rd row 2880.0
4th row 1219.0
5th row 1100.0
Value Count Frequency (%)
1000.0 7671 1.4%
100.0 7194 1.3%
200.0 6722 1.2%
500.0 6255 1.1%
300.0 6128 1.1%
1500.0 5575 1.0%
800.0 5408 1.0%
900.0 5284 1.0%
1200.0 5262 1.0%
400.0 5236 1.0%
Other values (5246) 486798 88.9%

Most occurring characters

ValueCountFrequency (%)
0 1005275
34.4%
. 547531
18.7%
1 294369
 
10.1%
5 218789
 
7.5%
2 216684
 
7.4%
3 143662
 
4.9%
4 113390
 
3.9%
7 104322
 
3.6%
6 101282
 
3.5%
8 94460
 
3.2%
Other values (5) 86504
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2378653
81.3%
Other Punctuation 547531
 
18.7%
Dash Punctuation 81
 
< 0.1%
Uppercase Letter 3
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1005275
42.3%
1 294369
 
12.4%
5 218789
 
9.2%
2 216684
 
9.1%
3 143662
 
6.0%
4 113390
 
4.8%
7 104322
 
4.4%
6 101282
 
4.3%
8 94460
 
4.0%
9 86420
 
3.6%
Uppercase Letter
ValueCountFrequency (%)
E 1
33.3%
M 1
33.3%
L 1
33.3%
Other Punctuation
ValueCountFrequency (%)
. 547531
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 81
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2926265
> 99.9%
Latin 3
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1005275
34.4%
. 547531
18.7%
1 294369
 
10.1%
5 218789
 
7.5%
2 216684
 
7.4%
3 143662
 
4.9%
4 113390
 
3.9%
7 104322
 
3.6%
6 101282
 
3.5%
8 94460
 
3.2%
Other values (2) 86501
 
3.0%
Latin
ValueCountFrequency (%)
E 1
33.3%
M 1
33.3%
L 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2926268
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1005275
34.4%
. 547531
18.7%
1 294369
 
10.1%
5 218789
 
7.5%
2 216684
 
7.4%
3 143662
 
4.9%
4 113390
 
3.9%
7 104322
 
3.6%
6 101282
 
3.5%
8 94460
 
3.2%
Other values (5) 86504
 
3.0%

elevationAccuracy
Text

Missing 

Distinct 941
Distinct (%) 0.5%
Missing 2160162
Missing (%) 91.5%
Memory size 18.0 MiB

Length

Max length 24
Median length 3
Mean length 3.726492839
Min length 3

Characters and Unicode

Total characters 750184
Distinct characters 20
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 301
Unique (%) 0.1%

Sample

1st row 48.5
2nd row 0.0
3rd row 75.0
4th row 50.0
5th row 0.0
Value Count Frequency (%)
0.0 94148 46.8%
50.0 14946 7.4%
100.0 9655 4.8%
150.0 6530 3.2%
25.0 6433 3.2%
75.0 3885 1.9%
200.0 3746 1.9%
152.5 3332 1.7%
15.0 2755 1.4%
10.0 2349 1.2%
Other values (931) 53532 26.6%

Most occurring characters

ValueCountFrequency (%)
0 345550
46.1%
. 201303
26.8%
5 78571
 
10.5%
1 40957
 
5.5%
2 31510
 
4.2%
3 14551
 
1.9%
7 13509
 
1.8%
4 7670
 
1.0%
6 7628
 
1.0%
8 5571
 
0.7%
Other values (10) 3364
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 548869
73.2%
Other Punctuation 201305
 
26.8%
Lowercase Letter 5
 
< 0.1%
Uppercase Letter 3
 
< 0.1%
Dash Punctuation 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 345550
63.0%
5 78571
 
14.3%
1 40957
 
7.5%
2 31510
 
5.7%
3 14551
 
2.7%
7 13509
 
2.5%
4 7670
 
1.4%
6 7628
 
1.4%
8 5571
 
1.0%
9 3352
 
0.6%
Lowercase Letter
ValueCountFrequency (%)
e 2
40.0%
r 1
20.0%
s 1
20.0%
a 1
20.0%
Uppercase Letter
ValueCountFrequency (%)
P 1
33.3%
T 1
33.3%
Z 1
33.3%
Other Punctuation
ValueCountFrequency (%)
. 201303
> 99.9%
: 2
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 750176
> 99.9%
Latin 8
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 345550
46.1%
. 201303
26.8%
5 78571
 
10.5%
1 40957
 
5.5%
2 31510
 
4.2%
3 14551
 
1.9%
7 13509
 
1.8%
4 7670
 
1.0%
6 7628
 
1.0%
8 5571
 
0.7%
Other values (3) 3356
 
0.4%
Latin
ValueCountFrequency (%)
e 2
25.0%
P 1
12.5%
r 1
12.5%
s 1
12.5%
a 1
12.5%
T 1
12.5%
Z 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 750184
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 345550
46.1%
. 201303
26.8%
5 78571
 
10.5%
1 40957
 
5.5%
2 31510
 
4.2%
3 14551
 
1.9%
7 13509
 
1.8%
4 7670
 
1.0%
6 7628
 
1.0%
8 5571
 
0.7%
Other values (10) 3364
 
0.4%

depth
Text

Missing 

Distinct 6333
Distinct (%) 2.4%
Missing 2098489
Missing (%) 88.9%
Memory size 18.0 MiB

Length

Max length 24
Median length 19
Mean length 4.326179539
Min length 3

Characters and Unicode

Total characters 1137716
Distinct characters 20
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1835
Unique (%) 0.7%

Sample

1st row 9.1
2nd row 200.0
3rd row 3200.0
4th row 40.0
5th row 824.0
Value Count Frequency (%)
0.5 9154 3.5%
1.0 5358 2.0%
18.0 4141 1.6%
3.0 4066 1.5%
1.5 3594 1.4%
12.0 3107 1.2%
2.0 2972 1.1%
6.0 2906 1.1%
15.0 2686 1.0%
24.0 2578 1.0%
Other values (6323) 222422 84.6%

Most occurring characters

ValueCountFrequency (%)
0 279518
24.6%
. 262982
23.1%
5 110628
 
9.7%
1 109984
 
9.7%
2 82032
 
7.2%
3 61134
 
5.4%
4 55715
 
4.9%
6 46039
 
4.0%
8 45736
 
4.0%
7 43807
 
3.9%
Other values (10) 40141
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 874722
76.9%
Other Punctuation 262984
 
23.1%
Lowercase Letter 5
 
< 0.1%
Uppercase Letter 3
 
< 0.1%
Dash Punctuation 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 279518
32.0%
5 110628
 
12.6%
1 109984
 
12.6%
2 82032
 
9.4%
3 61134
 
7.0%
4 55715
 
6.4%
6 46039
 
5.3%
8 45736
 
5.2%
7 43807
 
5.0%
9 40129
 
4.6%
Lowercase Letter
ValueCountFrequency (%)
e 2
40.0%
r 1
20.0%
s 1
20.0%
a 1
20.0%
Uppercase Letter
ValueCountFrequency (%)
P 1
33.3%
T 1
33.3%
Z 1
33.3%
Other Punctuation
ValueCountFrequency (%)
. 262982
> 99.9%
: 2
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1137708
> 99.9%
Latin 8
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 279518
24.6%
. 262982
23.1%
5 110628
 
9.7%
1 109984
 
9.7%
2 82032
 
7.2%
3 61134
 
5.4%
4 55715
 
4.9%
6 46039
 
4.0%
8 45736
 
4.0%
7 43807
 
3.9%
Other values (3) 40133
 
3.5%
Latin
ValueCountFrequency (%)
e 2
25.0%
P 1
12.5%
r 1
12.5%
s 1
12.5%
a 1
12.5%
T 1
12.5%
Z 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1137716
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 279518
24.6%
. 262982
23.1%
5 110628
 
9.7%
1 109984
 
9.7%
2 82032
 
7.2%
3 61134
 
5.4%
4 55715
 
4.9%
6 46039
 
4.0%
8 45736
 
4.0%
7 43807
 
3.9%
Other values (10) 40141
 
3.5%

depthAccuracy
Text

Missing 

Distinct 1495
Distinct (%) 0.6%
Missing 2120420
Missing (%) 89.8%
Memory size 18.0 MiB

Length

Max length 21
Median length 3
Mean length 3.319236848
Min length 3

Characters and Unicode

Total characters 800112
Distinct characters 15
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 326
Unique (%) 0.1%

Sample

1st row 0.0
2nd row 0.0
3rd row 0.0
4th row 10.0
5th row 20.0
Value Count Frequency (%)
0.0 142418 59.1%
0.5 12237 5.1%
3.0 8209 3.4%
1.0 7211 3.0%
1.5 6496 2.7%
2.0 4403 1.8%
2.5 4398 1.8%
5.0 2956 1.2%
4.5 2477 1.0%
3.5 1687 0.7%
Other values (1485) 48561 20.1%

Most occurring characters

ValueCountFrequency (%)
0 372794
46.6%
. 241051
30.1%
5 62907
 
7.9%
1 31462
 
3.9%
2 22080
 
2.8%
9 20707
 
2.6%
3 17061
 
2.1%
4 11040
 
1.4%
7 8028
 
1.0%
6 7619
 
1.0%
Other values (5) 5363
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 559057
69.9%
Other Punctuation 241051
30.1%
Lowercase Letter 4
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 372794
66.7%
5 62907
 
11.3%
1 31462
 
5.6%
2 22080
 
3.9%
9 20707
 
3.7%
3 17061
 
3.1%
4 11040
 
2.0%
7 8028
 
1.4%
6 7619
 
1.4%
8 5359
 
1.0%
Lowercase Letter
ValueCountFrequency (%)
t 1
25.0%
r 1
25.0%
u 1
25.0%
e 1
25.0%
Other Punctuation
ValueCountFrequency (%)
. 241051
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 800108
> 99.9%
Latin 4
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 372794
46.6%
. 241051
30.1%
5 62907
 
7.9%
1 31462
 
3.9%
2 22080
 
2.8%
9 20707
 
2.6%
3 17061
 
2.1%
4 11040
 
1.4%
7 8028
 
1.0%
6 7619
 
1.0%
Latin
ValueCountFrequency (%)
t 1
25.0%
r 1
25.0%
u 1
25.0%
e 1
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 800112
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 372794
46.6%
. 241051
30.1%
5 62907
 
7.9%
1 31462
 
3.9%
2 22080
 
2.8%
9 20707
 
2.6%
3 17061
 
2.1%
4 11040
 
1.4%
7 8028
 
1.0%
6 7619
 
1.0%
Other values (5) 5363
 
0.7%
Distinct: 910
Distinct (%): 19.6%
Missing: 2356831
Missing (%): 99.8%
Memory size: 18.0 MiB

Length

Max length: 18
Median length: 17
Mean length: 14.57712193
Min length: 3

Characters and Unicode

Total characters: 67667
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 466
Unique (%): 10.0%

Sample

1st row: 365.13018771678105
2nd row: 0.0
3rd row: 3.650579245692265
4th row: 0.0
5th row: 0.0
Value | Count | Frequency (%)
0.0 | 906 | 19.5%
511.15289545417056 | 224 | 4.8%
4105.643932903784 | 143 | 3.1%
365.9456782615661 | 97 | 2.1%
2063.191632254214 | 87 | 1.9%
4961.494346970892 | 60 | 1.3%
2015.7207067821585 | 54 | 1.2%
1436.265124532336 | 53 | 1.1%
949.7490617483568 | 46 | 1.0%
3997.886559051776 | 41 | 0.9%
Other values (900) | 2931 | 63.1%

Most occurring characters

ValueCountFrequency (%)
4 7071
10.4%
0 6998
10.3%
5 6883
10.2%
1 6804
10.1%
2 6313
9.3%
3 6228
9.2%
6 5979
8.8%
8 5873
8.7%
9 5775
8.5%
7 5102
7.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 63026
93.1%
Other Punctuation 4641
 
6.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 7071
11.2%
0 6998
11.1%
5 6883
10.9%
1 6804
10.8%
2 6313
10.0%
3 6228
9.9%
6 5979
9.5%
8 5873
9.3%
9 5775
9.2%
7 5102
8.1%
Other Punctuation
ValueCountFrequency (%)
. 4641
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 67667
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 7071
10.4%
0 6998
10.3%
5 6883
10.2%
1 6804
10.1%
2 6313
9.3%
3 6228
9.2%
6 5979
8.8%
8 5873
8.7%
9 5775
8.5%
7 5102
7.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 67667
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 7071
10.4%
0 6998
10.3%
5 6883
10.2%
1 6804
10.1%
2 6313
9.3%
3 6228
9.2%
6 5979
8.8%
8 5873
8.7%
9 5775
8.5%
7 5102
7.5%

issue
Text

Distinct: 543
Distinct (%): < 0.1%
Missing: 858
Missing (%): < 0.1%
Memory size: 18.0 MiB

Length

Max length: 210
Median length: 48
Mean length: 67.70638584
Min length: 7

Characters and Unicode

Total characters: 159828710
Distinct characters: 44
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 128
Unique (%): < 0.1%

Sample

1st row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_DERIVED_FROM_COORDINATES;CONTINENT_INVALID
2nd row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
3rd row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84
4th row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
5th row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_INVALID
Value | Count | Frequency (%)
occurrence_status_inferred_from_individual_count | 1322725 | 56.0%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84 | 251044 | 10.6%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_invalid | 152222 | 6.4%
occurrence_status_inferred_from_individual_count;continent_derived_from_country;continent_invalid | 105605 | 4.5%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_derived_from_coordinates;continent_invalid | 86812 | 3.7%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_derived_from_coordinates | 76729 | 3.3%
occurrence_status_inferred_from_individual_count;taxon_match_higherrank | 71534 | 3.0%
occurrence_status_inferred_from_individual_count;continent_derived_from_country | 67959 | 2.9%
occurrence_status_inferred_from_individual_count;taxon_match_fuzzy | 25595 | 1.1%
occurrence_status_inferred_from_individual_count;recorded_date_mismatch | 25313 | 1.1%
Other values (533) | 175077 | 7.4%
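
The issue values above are semicolon-delimited combinations of flags, so the counts describe whole combinations rather than individual flags. A minimal sketch of how per-flag counts might be obtained is shown here; the helper name `flag_counts` is hypothetical, and the input is only the five sample rows listed for this variable, not the full dataset.

```python
from collections import Counter


def flag_counts(issue_values, sep=";"):
    """Count individual flags in delimiter-joined issue strings."""
    counts = Counter()
    for value in issue_values:
        if not value:
            continue  # skip missing or empty entries
        for flag in value.split(sep):
            flag = flag.strip()
            if flag:
                counts[flag] += 1
    return counts


# The five sample rows shown for the issue variable.
rows = [
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;"
    "CONTINENT_DERIVED_FROM_COORDINATES;CONTINENT_INVALID",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;"
    "CONTINENT_INVALID",
]
counts = flag_counts(rows)
for flag, n in counts.most_common():
    print(flag, n)
```

The same splitting approach would presumably apply to other delimiter-joined columns in this report, such as mediaType.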

Most occurring characters

ValueCountFrequency (%)
_ 15993110
10.0%
E 13659218
 
8.5%
R 13467823
 
8.4%
N 13134550
 
8.2%
I 12670207
 
7.9%
C 11725720
 
7.3%
U 11078299
 
6.9%
T 11054854
 
6.9%
D 10756285
 
6.7%
O 9975135
 
6.2%
Other values (34) 36313509
22.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 140757398
88.1%
Connector Punctuation 15993110
 
10.0%
Other Punctuation 1760707
 
1.1%
Decimal Number 1317486
 
0.8%
Lowercase Letter 9
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 13659218
9.7%
R 13467823
9.6%
N 13134550
9.3%
I 12670207
9.0%
C 11725720
8.3%
U 11078299
7.9%
T 11054854
7.9%
D 10756285
7.6%
O 9975135
 
7.1%
A 7311251
 
5.2%
Other values (15) 25924056
18.4%
Decimal Number
ValueCountFrequency (%)
8 658738
50.0%
4 658738
50.0%
5 2
 
< 0.1%
3 2
 
< 0.1%
6 1
 
< 0.1%
1 1
 
< 0.1%
7 1
 
< 0.1%
9 1
 
< 0.1%
2 1
 
< 0.1%
0 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
a 3
33.3%
i 1
 
11.1%
n 1
 
11.1%
c 1
 
11.1%
m 1
 
11.1%
r 1
 
11.1%
e 1
 
11.1%
Connector Punctuation
ValueCountFrequency (%)
_ 15993110
100.0%
Other Punctuation
ValueCountFrequency (%)
; 1760707
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 140757407
88.1%
Common 19071303
 
11.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 13659218
9.7%
R 13467823
9.6%
N 13134550
9.3%
I 12670207
9.0%
C 11725720
8.3%
U 11078299
7.9%
T 11054854
7.9%
D 10756285
7.6%
O 9975135
 
7.1%
A 7311251
 
5.2%
Other values (22) 25924065
18.4%
Common
ValueCountFrequency (%)
_ 15993110
83.9%
; 1760707
 
9.2%
8 658738
 
3.5%
4 658738
 
3.5%
5 2
 
< 0.1%
3 2
 
< 0.1%
6 1
 
< 0.1%
1 1
 
< 0.1%
7 1
 
< 0.1%
9 1
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 159828710
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_ 15993110
10.0%
E 13659218
 
8.5%
R 13467823
 
8.4%
N 13134550
 
8.2%
I 12670207
 
7.9%
C 11725720
 
7.3%
U 11078299
 
6.9%
T 11054854
 
6.9%
D 10756285
 
6.7%
O 9975135
 
6.2%
Other values (34) 36313509
22.7%

mediaType
Text

Missing 

Distinct: 59
Distinct (%): < 0.1%
Missing: 863248
Missing (%): 36.6%
Memory size: 18.0 MiB

Length

Max length: 1011
Median length: 10
Mean length: 11.32266182
Min length: 5

Characters and Unicode

Total characters: 16963895
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): < 0.1%

Sample

1st row: StillImage
2nd row: StillImage
3rd row: StillImage
4th row: StillImage
5th row: StillImage
Value | Count | Frequency (%)
stillimage | 1393845 | 93.0%
stillimage;stillimage | 79719 | 5.3%
stillimage;stillimage;stillimage | 8722 | 0.6%
stillimage;stillimage;stillimage;stillimage | 7143 | 0.5%
stillimage;stillimage;stillimage;stillimage;stillimage | 2786 | 0.2%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage | 2282 | 0.2%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage | 958 | 0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage | 570 | < 0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage | 531 | < 0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage | 332 | < 0.1%
Other values (49) | 1337 | 0.1%

Most occurring characters

ValueCountFrequency (%)
l 3356749
19.8%
a 1678375
9.9%
e 1678375
9.9%
S 1678374
9.9%
t 1678374
9.9%
i 1678374
9.9%
I 1678374
9.9%
m 1678374
9.9%
g 1678374
9.9%
; 180150
 
1.1%
Other values (2) 2
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13426997
79.2%
Uppercase Letter 3356748
 
19.8%
Other Punctuation 180150
 
1.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 3356749
25.0%
a 1678375
12.5%
e 1678375
12.5%
t 1678374
12.5%
i 1678374
12.5%
m 1678374
12.5%
g 1678374
12.5%
f 1
 
< 0.1%
s 1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
S 1678374
50.0%
I 1678374
50.0%
Other Punctuation
ValueCountFrequency (%)
; 180150
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16783745
98.9%
Common 180150
 
1.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 3356749
20.0%
a 1678375
10.0%
e 1678375
10.0%
S 1678374
10.0%
t 1678374
10.0%
i 1678374
10.0%
I 1678374
10.0%
m 1678374
10.0%
g 1678374
10.0%
f 1
 
< 0.1%
Common
ValueCountFrequency (%)
; 180150
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16963895
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 3356749
19.8%
a 1678375
9.9%
e 1678375
9.9%
S 1678374
9.9%
t 1678374
9.9%
i 1678374
9.9%
I 1678374
9.9%
m 1678374
9.9%
g 1678374
9.9%
; 180150
 
1.1%
Other values (2) 2
 
< 0.1%
Distinct: 10
Distinct (%): < 0.1%
Missing: 5
Missing (%): < 0.1%
Memory size: 18.0 MiB

Length

Max length: 48
Median length: 5
Mean length: 4.698685309
Min length: 4

Characters and Unicode

Total characters: 11095795
Distinct characters: 57
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): < 0.1%

Sample

1st row: true
2nd row: false
3rd row: true
4th row: false
5th row: true
Value | Count | Frequency (%)
false | 1649753 | 69.9%
true | 711707 | 30.1%
1914 | 1 | < 0.1%
mitchill | 1 | < 0.1%
bilinearis | 1 | < 0.1%
merluccius | 1 | < 0.1%
greene | 1 | < 0.1%
blumeri | 1 | < 0.1%
senecio | 1 | < 0.1%
1900 | 1 | < 0.1%
Other values (17) | 17 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
e 2361474
21.3%
l 1649761
14.9%
s 1649759
14.9%
a 1649759
14.9%
f 1649753
14.9%
r 711716
 
6.4%
u 711716
 
6.4%
t 711714
 
6.4%
17
 
< 0.1%
i 16
 
< 0.1%
Other values (47) 110
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11095715
> 99.9%
Decimal Number 27
 
< 0.1%
Uppercase Letter 25
 
< 0.1%
Space Separator 17
 
< 0.1%
Other Punctuation 6
 
< 0.1%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2361474
21.3%
l 1649761
14.9%
s 1649759
14.9%
a 1649759
14.9%
f 1649753
14.9%
r 711716
 
6.4%
u 711716
 
6.4%
t 711714
 
6.4%
i 16
 
< 0.1%
o 8
 
< 0.1%
Other values (13) 39
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
H 3
 
12.0%
M 3
 
12.0%
A 2
 
8.0%
R 2
 
8.0%
L 1
 
4.0%
D 1
 
4.0%
S 1
 
4.0%
Z 1
 
4.0%
O 1
 
4.0%
P 1
 
4.0%
Other values (9) 9
36.0%
Decimal Number
ValueCountFrequency (%)
1 8
29.6%
9 4
14.8%
8 3
 
11.1%
7 3
 
11.1%
0 3
 
11.1%
4 2
 
7.4%
5 2
 
7.4%
3 1
 
3.7%
2 1
 
3.7%
Other Punctuation
ValueCountFrequency (%)
, 5
83.3%
& 1
 
16.7%
Space Separator
ValueCountFrequency (%)
17
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11095740
> 99.9%
Common 55
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2361474
21.3%
l 1649761
14.9%
s 1649759
14.9%
a 1649759
14.9%
f 1649753
14.9%
r 711716
 
6.4%
u 711716
 
6.4%
t 711714
 
6.4%
i 16
 
< 0.1%
o 8
 
< 0.1%
Other values (32) 64
 
< 0.1%
Common
ValueCountFrequency (%)
17
30.9%
1 8
14.5%
, 5
 
9.1%
9 4
 
7.3%
8 3
 
5.5%
7 3
 
5.5%
0 3
 
5.5%
( 2
 
3.6%
4 2
 
3.6%
) 2
 
3.6%
Other values (5) 6
 
10.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11095794
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2361474
21.3%
l 1649761
14.9%
s 1649759
14.9%
a 1649759
14.9%
f 1649753
14.9%
r 711716
 
6.4%
u 711716
 
6.4%
t 711714
 
6.4%
17
 
< 0.1%
i 16
 
< 0.1%
Other values (46) 109
 
< 0.1%
None
ValueCountFrequency (%)
ö 1
100.0%
Distinct: 5
Distinct (%): < 0.1%
Missing: 10
Missing (%): < 0.1%
Memory size: 18.0 MiB

Length

Max length: 18
Median length: 5
Mean length: 4.993394349
Min length: 4

Characters and Unicode

Total characters: 11791716
Distinct characters: 27
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: false
2nd row: false
3rd row: false
4th row: false
5th row: false
Value | Count | Frequency (%)
false | 2345838 | 99.3%
true | 15622 | 0.7%
north_america | 1 | < 0.1%
guatteria | 1 | < 0.1%
punctata | 1 | < 0.1%
species | 1 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
e 2361461
20.0%
a 2345842
19.9%
f 2345838
19.9%
l 2345838
19.9%
s 2345838
19.9%
t 15626
 
0.1%
u 15624
 
0.1%
r 15623
 
0.1%
E 3
 
< 0.1%
I 2
 
< 0.1%
Other values (17) 21
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11791694
> 99.9%
Uppercase Letter 20
 
< 0.1%
Space Separator 1
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 3
15.0%
I 2
10.0%
A 2
10.0%
C 2
10.0%
R 2
10.0%
S 2
10.0%
G 1
 
5.0%
M 1
 
5.0%
H 1
 
5.0%
T 1
 
5.0%
Other values (3) 3
15.0%
Lowercase Letter
ValueCountFrequency (%)
e 2361461
20.0%
a 2345842
19.9%
f 2345838
19.9%
l 2345838
19.9%
s 2345838
19.9%
t 15626
 
0.1%
u 15624
 
0.1%
r 15623
 
0.1%
p 1
 
< 0.1%
n 1
 
< 0.1%
Other values (2) 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11791714
> 99.9%
Common 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2361461
20.0%
a 2345842
19.9%
f 2345838
19.9%
l 2345838
19.9%
s 2345838
19.9%
t 15626
 
0.1%
u 15624
 
0.1%
r 15623
 
0.1%
E 3
 
< 0.1%
I 2
 
< 0.1%
Other values (15) 19
 
< 0.1%
Common
ValueCountFrequency (%)
1
50.0%
_ 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11791716
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2361461
20.0%
a 2345842
19.9%
f 2345838
19.9%
l 2345838
19.9%
s 2345838
19.9%
t 15626
 
0.1%
u 15624
 
0.1%
r 15623
 
0.1%
E 3
 
< 0.1%
I 2
 
< 0.1%
Other values (17) 21
 
< 0.1%
Distinct: 362006
Distinct (%): 15.3%
Missing: 12
Missing (%): < 0.1%
Memory size: 18.0 MiB

Length

Max length: 37
Median length: 7
Mean length: 6.857343822
Min length: 1

Characters and Unicode

Total characters: 16193350
Distinct characters: 32
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 179314
Unique (%): 7.6%

Sample

1st row: 3869
2nd row: 5374585
3rd row: 2431199
4th row: 714
5th row: 2322812
Value | Count | Frequency (%)
2431491 | 19390 | 0.8%
225 | 6083 | 0.3%
0 | 5762 | 0.2%
8176985 | 4732 | 0.2%
5967481 | 3865 | 0.2%
2437967 | 3463 | 0.1%
2431539 | 3260 | 0.1%
2440447 | 2983 | 0.1%
105 | 2810 | 0.1%
1340278 | 2739 | 0.1%
Other values (361999) | 2306377 | 97.7%

Most occurring characters

ValueCountFrequency (%)
2 2449798
15.1%
3 1788146
11.0%
4 1653847
10.2%
1 1611024
9.9%
5 1582086
9.8%
7 1533979
9.5%
6 1419285
8.8%
8 1406069
8.7%
9 1386672
8.6%
0 1362407
8.4%
Other values (22) 37
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16193313
> 99.9%
Lowercase Letter 24
 
< 0.1%
Uppercase Letter 5
 
< 0.1%
Other Punctuation 3
 
< 0.1%
Space Separator 3
 
< 0.1%
Close Punctuation 1
 
< 0.1%
Open Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5
20.8%
t 4
16.7%
u 3
12.5%
r 2
 
8.3%
o 1
 
4.2%
l 1
 
4.2%
w 1
 
4.2%
i 1
 
4.2%
b 1
 
4.2%
c 1
 
4.2%
Other values (4) 4
16.7%
Decimal Number
ValueCountFrequency (%)
2 2449798
15.1%
3 1788146
11.0%
4 1653847
10.2%
1 1611024
9.9%
5 1582086
9.8%
7 1533979
9.5%
6 1419285
8.8%
8 1406069
8.7%
9 1386672
8.6%
0 1362407
8.4%
Uppercase Letter
ValueCountFrequency (%)
A 2
40.0%
H 1
20.0%
R 1
20.0%
G 1
20.0%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%
Space Separator
ValueCountFrequency (%)
3
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 16193321
> 99.9%
Latin 29
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5
17.2%
t 4
13.8%
u 3
 
10.3%
r 2
 
6.9%
A 2
 
6.9%
o 1
 
3.4%
l 1
 
3.4%
H 1
 
3.4%
R 1
 
3.4%
w 1
 
3.4%
Other values (8) 8
27.6%
Common
ValueCountFrequency (%)
2 2449798
15.1%
3 1788146
11.0%
4 1653847
10.2%
1 1611024
9.9%
5 1582086
9.8%
7 1533979
9.5%
6 1419285
8.8%
8 1406069
8.7%
9 1386672
8.6%
0 1362407
8.4%
Other values (4) 8
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16193350
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2449798
15.1%
3 1788146
11.0%
4 1653847
10.2%
1 1611024
9.9%
5 1582086
9.8%
7 1533979
9.5%
6 1419285
8.8%
8 1406069
8.7%
9 1386672
8.6%
0 1362407
8.4%
Other values (22) 37
 
< 0.1%
Distinct: 315017
Distinct (%): 13.4%
Missing: 5774
Missing (%): 0.2%
Memory size: 18.0 MiB

Length

Max length: 18
Median length: 7
Mean length: 6.879705769
Min length: 1

Characters and Unicode

Total characters: 16206516
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 142213
Unique (%): 6.0%

Sample

1st row: 3869
2nd row: 3044413
3rd row: 2431199
4th row: 714
5th row: 2322812
Value | Count | Frequency (%)
2431491 | 19390 | 0.8%
225 | 6083 | 0.3%
7947184 | 4743 | 0.2%
5967481 | 3865 | 0.2%
2437967 | 3815 | 0.2%
2431539 | 3260 | 0.1%
2440447 | 2987 | 0.1%
105 | 2810 | 0.1%
1340278 | 2739 | 0.1%
2431224 | 2562 | 0.1%
Other values (315008) | 2303446 | 97.8%

Most occurring characters

ValueCountFrequency (%)
2 2439480
15.1%
3 1778291
11.0%
4 1658961
10.2%
1 1632514
10.1%
5 1571567
9.7%
7 1538551
9.5%
8 1409644
8.7%
6 1402728
8.7%
9 1400354
8.6%
0 1374408
8.5%
Other values (11) 18
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16206498
> 99.9%
Lowercase Letter 16
 
< 0.1%
Space Separator 1
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2439480
15.1%
3 1778291
11.0%
4 1658961
10.2%
1 1632514
10.1%
5 1571567
9.7%
7 1538551
9.5%
8 1409644
8.7%
6 1402728
8.7%
9 1400354
8.6%
0 1374408
8.5%
Lowercase Letter
ValueCountFrequency (%)
a 4
25.0%
t 4
25.0%
u 2
12.5%
n 1
 
6.2%
p 1
 
6.2%
i 1
 
6.2%
r 1
 
6.2%
e 1
 
6.2%
c 1
 
6.2%
Space Separator
ValueCountFrequency (%)
1
100.0%
Uppercase Letter
ValueCountFrequency (%)
G 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 16206499
> 99.9%
Latin 17
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2439480
15.1%
3 1778291
11.0%
4 1658961
10.2%
1 1632514
10.1%
5 1571567
9.7%
7 1538551
9.5%
8 1409644
8.7%
6 1402728
8.7%
9 1400354
8.6%
0 1374408
8.5%
Latin
ValueCountFrequency (%)
a 4
23.5%
t 4
23.5%
u 2
11.8%
n 1
 
5.9%
p 1
 
5.9%
G 1
 
5.9%
i 1
 
5.9%
r 1
 
5.9%
e 1
 
5.9%
c 1
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16206516
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2439480
15.1%
3 1778291
11.0%
4 1658961
10.2%
1 1632514
10.1%
5 1571567
9.7%
7 1538551
9.5%
8 1409644
8.7%
6 1402728
8.7%
9 1400354
8.6%
0 1374408
8.5%
Other values (11) 18
 
< 0.1%
Distinct: 8
Distinct (%): < 0.1%
Missing: 12
Missing (%): < 0.1%
Memory size: 18.0 MiB

Length

Max length: 24
Median length: 1
Mean length: 1.00000974
Min length: 1

Characters and Unicode

Total characters: 2361484
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 6
3rd row: 1
4th row: 1
5th row: 1
Value | Count | Frequency (%)
1 | 1209386 | 51.2%
6 | 1054744 | 44.7%
5 | 56807 | 2.4%
4 | 20874 | 0.9%
3 | 13612 | 0.6%
0 | 5762 | 0.2%
7 | 275 | < 0.1%
sphaeralcea | 1 | < 0.1%
palmeri | 1 | < 0.1%
rose | 1 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
1 1209386
51.2%
6 1054744
44.7%
5 56807
 
2.4%
4 20874
 
0.9%
3 13612
 
0.6%
0 5762
 
0.2%
7 275
 
< 0.1%
e 4
 
< 0.1%
a 4
 
< 0.1%
p 2
 
< 0.1%
Other values (11) 14
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2361460
> 99.9%
Lowercase Letter 20
 
< 0.1%
Space Separator 2
 
< 0.1%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 4
20.0%
a 4
20.0%
p 2
10.0%
r 2
10.0%
l 2
10.0%
h 1
 
5.0%
c 1
 
5.0%
m 1
 
5.0%
i 1
 
5.0%
o 1
 
5.0%
Decimal Number
ValueCountFrequency (%)
1 1209386
51.2%
6 1054744
44.7%
5 56807
 
2.4%
4 20874
 
0.9%
3 13612
 
0.6%
0 5762
 
0.2%
7 275
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
S 1
50.0%
R 1
50.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2361462
> 99.9%
Latin 22
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 4
18.2%
a 4
18.2%
p 2
9.1%
r 2
9.1%
l 2
9.1%
h 1
 
4.5%
S 1
 
4.5%
c 1
 
4.5%
m 1
 
4.5%
i 1
 
4.5%
Other values (3) 3
13.6%
Common
ValueCountFrequency (%)
1 1209386
51.2%
6 1054744
44.7%
5 56807
 
2.4%
4 20874
 
0.9%
3 13612
 
0.6%
0 5762
 
0.2%
7 275
 
< 0.1%
2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2361484
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1209386
51.2%
6 1054744
44.7%
5 56807
 
2.4%
4 20874
 
0.9%
3 13612
 
0.6%
0 5762
 
0.2%
7 275
 
< 0.1%
e 4
 
< 0.1%
a 4
 
< 0.1%
p 2
 
< 0.1%
Other values (11) 14
 
< 0.1%
Distinct: 63
Distinct (%): < 0.1%
Missing: 7897
Missing (%): 0.3%
Memory size: 18.0 MiB

Length

Max length: 8
Median length: 2
Mean length: 4.119775185
Min length: 1

Characters and Unicode

Total characters: 9696204
Distinct characters: 18
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: 54
2nd row: 7707728
3rd row: 44
4th row: 43
5th row: 42
Value | Count | Frequency (%)
7707728 | 965311 | 41.0%
44 | 572771 | 24.3%
54 | 252406 | 10.7%
52 | 220179 | 9.4%
42 | 61416 | 2.6%
95 | 56083 | 2.4%
35 | 37922 | 1.6%
106 | 30954 | 1.3%
43 | 29998 | 1.3%
50 | 23220 | 1.0%
Other values (53) | 103316 | 4.4%

Most occurring characters

ValueCountFrequency (%)
7 3890684
40.1%
4 1510363
 
15.6%
2 1249598
 
12.9%
0 1044134
 
10.8%
8 1032331
 
10.6%
5 623579
 
6.4%
9 110301
 
1.1%
3 83881
 
0.9%
6 78759
 
0.8%
1 72563
 
0.7%
Other values (8) 11
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9696193
> 99.9%
Uppercase Letter 11
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 3890684
40.1%
4 1510363
 
15.6%
2 1249598
 
12.9%
0 1044134
 
10.8%
8 1032331
 
10.6%
5 623579
 
6.4%
9 110301
 
1.1%
3 83881
 
0.9%
6 78759
 
0.8%
1 72563
 
0.7%
Uppercase Letter
ValueCountFrequency (%)
E 3
27.3%
C 2
18.2%
M 1
 
9.1%
L 1
 
9.1%
A 1
 
9.1%
P 1
 
9.1%
T 1
 
9.1%
D 1
 
9.1%

Most occurring scripts

ValueCountFrequency (%)
Common 9696193
> 99.9%
Latin 11
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
7 3890684
40.1%
4 1510363
 
15.6%
2 1249598
 
12.9%
0 1044134
 
10.8%
8 1032331
 
10.6%
5 623579
 
6.4%
9 110301
 
1.1%
3 83881
 
0.9%
6 78759
 
0.8%
1 72563
 
0.7%
Latin
ValueCountFrequency (%)
E 3
27.3%
C 2
18.2%
M 1
 
9.1%
L 1
 
9.1%
A 1
 
9.1%
P 1
 
9.1%
T 1
 
9.1%
D 1
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9696204
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 3890684
40.1%
4 1510363
 
15.6%
2 1249598
 
12.9%
0 1044134
 
10.8%
8 1032331
 
10.6%
5 623579
 
6.4%
9 110301
 
1.1%
3 83881
 
0.9%
6 78759
 
0.8%
1 72563
 
0.7%
Other values (8) 11
 
< 0.1%

classKey
Text

Missing 

Distinct: 185
Distinct (%): < 0.1%
Missing: 138564
Missing (%): 5.9%
Memory size: 18.0 MiB

Length

Max length: 24
Median length: 3
Mean length: 3.356916995
Min length: 3

Characters and Unicode

Total characters: 7462121
Distinct characters: 15
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 19
Unique (%): < 0.1%

Sample

1st row: 229
2nd row: 220
3rd row: 131
4th row: 206
5th row: 256
Value | Count | Frequency (%)
220 | 657370 | 29.6%
196 | 231154 | 10.4%
225 | 155259 | 7.0%
359 | 152953 | 6.9%
216 | 149742 | 6.7%
212 | 149231 | 6.7%
131 | 100689 | 4.5%
229 | 76525 | 3.4%
7228684 | 63916 | 2.9%
256 | 53619 | 2.4%
Other values (175) | 432451 | 19.5%

Most occurring characters

ValueCountFrequency (%)
2 2662639
35.7%
1 1116752
15.0%
0 779965
 
10.5%
5 577305
 
7.7%
6 574737
 
7.7%
9 561562
 
7.5%
3 542610
 
7.3%
7 243304
 
3.3%
8 204616
 
2.7%
4 198624
 
2.7%
Other values (5) 7
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7462114
> 99.9%
Other Punctuation 3
 
< 0.1%
Dash Punctuation 2
 
< 0.1%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2662639
35.7%
1 1116752
15.0%
0 779965
 
10.5%
5 577305
 
7.7%
6 574737
 
7.7%
9 561562
 
7.5%
3 542610
 
7.3%
7 243304
 
3.3%
8 204616
 
2.7%
4 198624
 
2.7%
Other Punctuation
ValueCountFrequency (%)
: 2
66.7%
. 1
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 1
50.0%
Z 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7462119
> 99.9%
Latin 2
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2662639
35.7%
1 1116752
15.0%
0 779965
 
10.5%
5 577305
 
7.7%
6 574737
 
7.7%
9 561562
 
7.5%
3 542610
 
7.3%
7 243304
 
3.3%
8 204616
 
2.7%
4 198624
 
2.7%
Other values (3) 5
 
< 0.1%
Latin
ValueCountFrequency (%)
T 1
50.0%
Z 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7462121
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2662639
35.7%
1 1116752
15.0%
0 779965
 
10.5%
5 577305
 
7.7%
6 574737
 
7.7%
9 561562
 
7.5%
3 542610
 
7.3%
7 243304
 
3.3%
8 204616
 
2.7%
4 198624
 
2.7%
Other values (5) 7
 
< 0.1%

orderKey
Text

Missing 

Distinct: 932
Distinct (%): < 0.1%
Missing: 145723
Missing (%): 6.2%
Memory size: 18.0 MiB

Length

Max length: 133
Median length: 3
Mean length: 3.806610403
Min length: 3

Characters and Unicode

Total characters: 8434497
Distinct characters: 50
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 81
Unique (%): < 0.1%

Sample

1st row: 637
2nd row: 7225535
3rd row: 953
4th row: 714
5th row: 865
Value | Count | Frequency (%)
1369 | 178531 | 8.1%
414 | 96944 | 4.4%
729 | 94751 | 4.3%
1459 | 75757 | 3.4%
408 | 67866 | 3.1%
1370 | 64632 | 2.9%
953 | 60565 | 2.7%
587 | 54527 | 2.5%
1414 | 53482 | 2.4%
637 | 49962 | 2.3%
Other values (947) | 1418762 | 64.0%

Most occurring characters

ValueCountFrequency (%)
1 1389199
16.5%
9 1119561
13.3%
4 1065109
12.6%
3 921277
10.9%
7 869118
10.3%
2 694377
8.2%
5 649591
7.7%
6 629528
7.5%
0 603865
7.2%
8 492425
 
5.8%
Other values (40) 447
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8434050
> 99.9%
Lowercase Letter 347
 
< 0.1%
Uppercase Letter 37
 
< 0.1%
Other Punctuation 32
 
< 0.1%
Space Separator 29
 
< 0.1%
Dash Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 50
14.4%
e 42
12.1%
i 37
10.7%
t 29
8.4%
o 28
 
8.1%
l 21
 
6.1%
r 21
 
6.1%
d 19
 
5.5%
s 15
 
4.3%
n 15
 
4.3%
Other values (11) 70
20.2%
Uppercase Letter
ValueCountFrequency (%)
A 11
29.7%
P 7
18.9%
C 3
 
8.1%
T 3
 
8.1%
M 2
 
5.4%
N 2
 
5.4%
D 2
 
5.4%
G 1
 
2.7%
O 1
 
2.7%
H 1
 
2.7%
Other values (4) 4
 
10.8%
Decimal Number
ValueCountFrequency (%)
1 1389199
16.5%
9 1119561
13.3%
4 1065109
12.6%
3 921277
10.9%
7 869118
10.3%
2 694377
8.2%
5 649591
7.7%
6 629528
7.5%
0 603865
7.2%
8 492425
 
5.8%
Other Punctuation
ValueCountFrequency (%)
, 29
90.6%
: 2
 
6.2%
. 1
 
3.1%
Space Separator
ValueCountFrequency (%)
29
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8434113
> 99.9%
Latin 384
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 50
13.0%
e 42
 
10.9%
i 37
 
9.6%
t 29
 
7.6%
o 28
 
7.3%
l 21
 
5.5%
r 21
 
5.5%
d 19
 
4.9%
s 15
 
3.9%
n 15
 
3.9%
Other values (25) 107
27.9%
Common
ValueCountFrequency (%)
1 1389199
16.5%
9 1119561
13.3%
4 1065109
12.6%
3 921277
10.9%
7 869118
10.3%
2 694377
8.2%
5 649591
7.7%
6 629528
7.5%
0 603865
7.2%
8 492425
 
5.8%
Other values (5) 63
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8434497
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1389199
16.5%
9 1119561
13.3%
4 1065109
12.6%
3 921277
10.9%
7 869118
10.3%
2 694377
8.2%
5 649591
7.7%
6 629528
7.5%
0 603865
7.2%
8 492425
 
5.8%
Other values (40) 447
 
< 0.1%

familyKey
Text

Missing 

Distinct6628
Distinct (%)0.3%
Missing52492
Missing (%)2.2%
Memory size18.0 MiB

Length

Max length 36
Median length 4
Mean length 4.265161125
Min length 4

Characters and Unicode

Total characters 9848176
Distinct characters 29
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
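
As a rough illustration of how per-category character counts like the ones in this report could be derived, the sketch below uses Python's standard `unicodedata` module. The helper name `category_counts` and the `CATEGORY_NAMES` mapping are hypothetical and cover only a subset of categories; this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

# Long names for a few Unicode general-category codes (illustrative subset,
# an assumption for this sketch; unmapped codes are reported as-is).
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample familyKey values shown in this section.
sample = ["3869", "3112", "6748", "2051", "4486"]
counts = category_counts(sample)
print(counts)
```

Applied to a full column, the same counter would give the "Most occurring categories" breakdown; script and block properties are not exposed by `unicodedata` and would need another source.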

Unique

Unique 723
Unique (%) < 0.1%

Sample

1st row 3869
2nd row 3112
3rd row 6748
4th row 2051
5th row 4486
ValueCountFrequency (%)
3073 128004
 
5.5%
3065 91253
 
4.0%
5386 60425
 
2.6%
6748 56509
 
2.4%
7708 35190
 
1.5%
8798 30478
 
1.3%
3240723 27411
 
1.2%
5510 23714
 
1.0%
4334 20894
 
0.9%
6683 18664
 
0.8%
Other values (6618) 1816439
78.7%

Most occurring characters

ValueCountFrequency (%)
6 1376781
14.0%
3 1350810
13.7%
7 1111863
11.3%
5 1030804
10.5%
4 948921
9.6%
2 926218
9.4%
8 924745
9.4%
0 815403
8.3%
9 744345
7.6%
1 618216
6.3%
Other values (19) 70
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9848106
> 99.9%
Lowercase Letter 60
 
< 0.1%
Uppercase Letter 6
 
< 0.1%
Dash Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 15
25.0%
i 9
15.0%
n 5
 
8.3%
m 5
 
8.3%
l 5
 
8.3%
c 4
 
6.7%
t 3
 
5.0%
e 3
 
5.0%
b 3
 
5.0%
d 2
 
3.3%
Other values (5) 6
 
10.0%
Decimal Number
ValueCountFrequency (%)
6 1376781
14.0%
3 1350810
13.7%
7 1111863
11.3%
5 1030804
10.5%
4 948921
9.6%
2 926218
9.4%
8 924745
9.4%
0 815403
8.3%
9 744345
7.6%
1 618216
6.3%
Uppercase Letter
ValueCountFrequency (%)
A 4
66.7%
P 1
 
16.7%
C 1
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9848110
> 99.9%
Latin 66
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 15
22.7%
i 9
13.6%
n 5
 
7.6%
m 5
 
7.6%
l 5
 
7.6%
c 4
 
6.1%
A 4
 
6.1%
t 3
 
4.5%
e 3
 
4.5%
b 3
 
4.5%
Other values (8) 10
15.2%
Common
ValueCountFrequency (%)
6 1376781
14.0%
3 1350810
13.7%
7 1111863
11.3%
5 1030804
10.5%
4 948921
9.6%
2 926218
9.4%
8 924745
9.4%
0 815403
8.3%
9 744345
7.6%
1 618216
6.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9848176
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 1376781
14.0%
3 1350810
13.7%
7 1111863
11.3%
5 1030804
10.5%
4 948921
9.6%
2 926218
9.4%
8 924745
9.4%
0 815403
8.3%
9 744345
7.6%
1 618216
6.3%
Other values (19) 70
 
< 0.1%

genusKey
Text

Missing 

Distinct 59199
Distinct (%) 2.6%
Missing 120649
Missing (%) 5.1%
Memory size 18.0 MiB

Length

Max length 15
Median length 7
Mean length 7.014589276
Min length 2

Characters and Unicode

Total characters 15718460
Distinct characters 33
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 16745
Unique (%) 0.7%

Sample

1st row 3044392
2nd row 2431198
3rd row 2322781
4th row 4798968
5th row 4557352
ValueCountFrequency (%)
2431477 42953
 
1.9%
1340278 15824
 
0.7%
2721893 14686
 
0.7%
3188558 10093
 
0.5%
2437961 10025
 
0.4%
2431198 9258
 
0.4%
2607519 7917
 
0.4%
2704173 7658
 
0.3%
2713455 7007
 
0.3%
2705540 6575
 
0.3%
Other values (59189) 2108828
94.1%

Most occurring characters

ValueCountFrequency (%)
2 2750236
17.5%
3 1857335
11.8%
4 1678697
10.7%
7 1472783
9.4%
1 1465143
9.3%
8 1402390
8.9%
9 1391416
8.9%
0 1293394
8.2%
6 1218552
7.8%
5 1188447
7.6%
Other values (23) 67
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 15718393
> 99.9%
Lowercase Letter 59
 
< 0.1%
Uppercase Letter 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 9
15.3%
t 7
11.9%
h 7
11.9%
e 6
10.2%
o 5
8.5%
y 4
 
6.8%
l 4
 
6.8%
m 3
 
5.1%
i 2
 
3.4%
n 2
 
3.4%
Other values (6) 10
16.9%
Decimal Number
ValueCountFrequency (%)
2 2750236
17.5%
3 1857335
11.8%
4 1678697
10.7%
7 1472783
9.4%
1 1465143
9.3%
8 1402390
8.9%
9 1391416
8.9%
0 1293394
8.2%
6 1218552
7.8%
5 1188447
7.6%
Uppercase Letter
ValueCountFrequency (%)
P 2
25.0%
U 1
12.5%
M 1
12.5%
S 1
12.5%
C 1
12.5%
T 1
12.5%
N 1
12.5%

Most occurring scripts

ValueCountFrequency (%)
Common 15718393
> 99.9%
Latin 67
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 9
13.4%
t 7
 
10.4%
h 7
 
10.4%
e 6
 
9.0%
o 5
 
7.5%
y 4
 
6.0%
l 4
 
6.0%
m 3
 
4.5%
P 2
 
3.0%
i 2
 
3.0%
Other values (13) 18
26.9%
Common
ValueCountFrequency (%)
2 2750236
17.5%
3 1857335
11.8%
4 1678697
10.7%
7 1472783
9.4%
1 1465143
9.3%
8 1402390
8.9%
9 1391416
8.9%
0 1293394
8.2%
6 1218552
7.8%
5 1188447
7.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15718460
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2750236
17.5%
3 1857335
11.8%
4 1678697
10.7%
7 1472783
9.4%
1 1465143
9.3%
8 1402390
8.9%
9 1391416
8.9%
0 1293394
8.2%
6 1218552
7.8%
5 1188447
7.6%
Other values (23) 67
 
< 0.1%

subgenusKey
Text

Missing 

Distinct 7
Distinct (%) 100.0%
Missing 2361466
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 24
Median length 11
Mean length 11.14285714
Min length 2

Characters and Unicode

Total characters 78
Distinct characters 33
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7
Unique (%) 100.0%

Sample

1st row Chromadorea
2nd row Aconoidasida
3rd row NE
4th row Cestoda
5th row Trematoda
ValueCountFrequency (%)
chromadorea 1
14.3%
aconoidasida 1
14.3%
ne 1
14.3%
cestoda 1
14.3%
trematoda 1
14.3%
magnoliopsida 1
14.3%
2024-12-02t13:59:17.155z 1
14.3%

Most occurring characters

ValueCountFrequency (%)
a 9
 
11.5%
o 8
 
10.3%
d 6
 
7.7%
1 4
 
5.1%
i 4
 
5.1%
2 4
 
5.1%
r 3
 
3.8%
5 3
 
3.8%
e 3
 
3.8%
s 3
 
3.8%
Other values (23) 31
39.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 47
60.3%
Decimal Number 17
 
21.8%
Uppercase Letter 9
 
11.5%
Other Punctuation 3
 
3.8%
Dash Punctuation 2
 
2.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 9
19.1%
o 8
17.0%
d 6
12.8%
i 4
8.5%
r 3
 
6.4%
e 3
 
6.4%
s 3
 
6.4%
t 2
 
4.3%
n 2
 
4.3%
m 2
 
4.3%
Other values (5) 5
10.6%
Decimal Number
ValueCountFrequency (%)
1 4
23.5%
2 4
23.5%
5 3
17.6%
0 2
11.8%
7 1
 
5.9%
9 1
 
5.9%
3 1
 
5.9%
4 1
 
5.9%
Uppercase Letter
ValueCountFrequency (%)
C 2
22.2%
T 2
22.2%
A 1
11.1%
M 1
11.1%
N 1
11.1%
E 1
11.1%
Z 1
11.1%
Other Punctuation
ValueCountFrequency (%)
: 2
66.7%
. 1
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 56
71.8%
Common 22
 
28.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 9
16.1%
o 8
14.3%
d 6
10.7%
i 4
 
7.1%
r 3
 
5.4%
e 3
 
5.4%
s 3
 
5.4%
C 2
 
3.6%
t 2
 
3.6%
T 2
 
3.6%
Other values (12) 14
25.0%
Common
ValueCountFrequency (%)
1 4
18.2%
2 4
18.2%
5 3
13.6%
: 2
9.1%
- 2
9.1%
0 2
9.1%
. 1
 
4.5%
7 1
 
4.5%
9 1
 
4.5%
3 1
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 78
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 9
 
11.5%
o 8
 
10.3%
d 6
 
7.7%
1 4
 
5.1%
i 4
 
5.1%
2 4
 
5.1%
r 3
 
3.8%
5 3
 
3.8%
e 3
 
3.8%
s 3
 
3.8%
Other values (23) 31
39.7%

speciesKey
Text

Missing 

Distinct 271285
Distinct (%) 13.2%
Missing 306496
Missing (%) 13.0%
Memory size 18.0 MiB

Length

Max length 43
Median length 7
Mean length 7.026590079
Min length 5

Characters and Unicode

Total characters 14439481
Distinct characters 39
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 123606
Unique (%) 6.0%

Sample

1st row 3044413
2nd row 2431199
3rd row 2322812
4th row 10722387
5th row 2429795
ValueCountFrequency (%)
2431491 19390
 
0.9%
2437967 4075
 
0.2%
2431539 3260
 
0.2%
2440447 2987
 
0.1%
2431224 2562
 
0.1%
2431506 2541
 
0.1%
2433176 2143
 
0.1%
2431516 2047
 
0.1%
2438019 1908
 
0.1%
2438655 1829
 
0.1%
Other values (271278) 2012238
97.9%

Most occurring characters

ValueCountFrequency (%)
2 2212881
15.3%
3 1628160
11.3%
4 1530596
10.6%
1 1427667
9.9%
5 1423128
9.9%
7 1309681
9.1%
8 1277861
8.8%
9 1262298
8.7%
0 1219617
8.4%
6 1147471
7.9%
Other values (29) 121
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14439360
> 99.9%
Lowercase Letter 104
 
< 0.1%
Uppercase Letter 10
 
< 0.1%
Other Punctuation 4
 
< 0.1%
Space Separator 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 18
17.3%
e 12
11.5%
i 10
9.6%
l 10
9.6%
o 9
8.7%
d 7
 
6.7%
s 6
 
5.8%
r 6
 
5.8%
t 5
 
4.8%
h 4
 
3.8%
Other values (9) 17
16.3%
Decimal Number
ValueCountFrequency (%)
2 2212881
15.3%
3 1628160
11.3%
4 1530596
10.6%
1 1427667
9.9%
5 1423128
9.9%
7 1309681
9.1%
8 1277861
8.8%
9 1262298
8.7%
0 1219617
8.4%
6 1147471
7.9%
Uppercase Letter
ValueCountFrequency (%)
P 3
30.0%
M 2
20.0%
A 1
 
10.0%
G 1
 
10.0%
H 1
 
10.0%
R 1
 
10.0%
D 1
 
10.0%
Other Punctuation
ValueCountFrequency (%)
, 3
75.0%
. 1
 
25.0%
Space Separator
ValueCountFrequency (%)
3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 14439367
> 99.9%
Latin 114
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 18
15.8%
e 12
10.5%
i 10
 
8.8%
l 10
 
8.8%
o 9
 
7.9%
d 7
 
6.1%
s 6
 
5.3%
r 6
 
5.3%
t 5
 
4.4%
h 4
 
3.5%
Other values (16) 27
23.7%
Common
ValueCountFrequency (%)
2 2212881
15.3%
3 1628160
11.3%
4 1530596
10.6%
1 1427667
9.9%
5 1423128
9.9%
7 1309681
9.1%
8 1277861
8.8%
9 1262298
8.7%
0 1219617
8.4%
6 1147471
7.9%
Other values (3) 7
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14439481
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2212881
15.3%
3 1628160
11.3%
4 1530596
10.6%
1 1427667
9.9%
5 1423128
9.9%
7 1309681
9.1%
8 1277861
8.8%
9 1262298
8.7%
0 1219617
8.4%
6 1147471
7.9%
Other values (29) 121
 
< 0.1%

species
Text

Missing 

Distinct 270918
Distinct (%) 13.2%
Missing 306502
Missing (%) 13.0%
Memory size 18.0 MiB

Length

Max length 41
Median length 35
Mean length 18.94208239
Min length 4

Characters and Unicode

Total characters 38925430
Distinct characters 59
Distinct categories 8
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 123378
Unique (%) 6.0%

Sample

1st row Paysonia lescurii
2nd row Desmognathus ochrophaeus
3rd row Ninoe kinbergi
4th row Hylogomphus adelphus
5th row Scaphiopus couchii
ValueCountFrequency (%)
plethodon 42272
 
1.0%
cinereus 21325
 
0.5%
carex 14397
 
0.4%
bombus 13087
 
0.3%
peromyscus 10009
 
0.2%
miconia 9511
 
0.2%
desmognathus 9016
 
0.2%
cladonia 7557
 
0.2%
poa 7484
 
0.2%
cyperus 6932
 
0.2%
Other values (144983) 3968538
96.6%

Most occurring characters

ValueCountFrequency (%)
a 4398026
 
11.3%
i 3607850
 
9.3%
s 2760190
 
7.1%
e 2620738
 
6.7%
o 2514893
 
6.5%
r 2406574
 
6.2%
l 2172600
 
5.6%
u 2139863
 
5.5%
n 2075734
 
5.3%
2055157
 
5.3%
Other values (49) 12173805
31.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 34810486
89.4%
Space Separator 2055157
 
5.3%
Uppercase Letter 2055001
 
5.3%
Dash Punctuation 4780
 
< 0.1%
Decimal Number 3
 
< 0.1%
Math Symbol 1
 
< 0.1%
Other Punctuation 1
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4398026
12.6%
i 3607850
10.4%
s 2760190
 
7.9%
e 2620738
 
7.5%
o 2514893
 
7.2%
r 2406574
 
6.9%
l 2172600
 
6.2%
u 2139863
 
6.1%
n 2075734
 
6.0%
t 1885466
 
5.4%
Other values (16) 8228552
23.6%
Uppercase Letter
ValueCountFrequency (%)
P 312538
15.2%
C 268039
13.0%
S 190370
9.3%
A 188463
9.2%
M 140447
 
6.8%
E 112512
 
5.5%
L 109813
 
5.3%
D 94012
 
4.6%
T 93925
 
4.6%
B 83855
 
4.1%
Other values (16) 461027
22.4%
Decimal Number
ValueCountFrequency (%)
0 2
66.7%
5 1
33.3%
Space Separator
ValueCountFrequency (%)
2055157
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4780
100.0%
Math Symbol
ValueCountFrequency (%)
× 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 36865487
94.7%
Common 2059943
 
5.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4398026
11.9%
i 3607850
 
9.8%
s 2760190
 
7.5%
e 2620738
 
7.1%
o 2514893
 
6.8%
r 2406574
 
6.5%
l 2172600
 
5.9%
u 2139863
 
5.8%
n 2075734
 
5.6%
t 1885466
 
5.1%
Other values (42) 10283553
27.9%
Common
ValueCountFrequency (%)
2055157
99.8%
- 4780
 
0.2%
0 2
 
< 0.1%
× 1
 
< 0.1%
5 1
 
< 0.1%
. 1
 
< 0.1%
_ 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 38925429
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4398026
 
11.3%
i 3607850
 
9.3%
s 2760190
 
7.1%
e 2620738
 
6.7%
o 2514893
 
6.5%
r 2406574
 
6.2%
l 2172600
 
5.6%
u 2139863
 
5.5%
n 2075734
 
5.3%
2055157
 
5.3%
Other values (48) 12173804
31.3%
None
ValueCountFrequency (%)
× 1
100.0%
Distinct 315019
Distinct (%) 13.4%
Missing 5767
Missing (%) 0.2%
Memory size 18.0 MiB

Length

Max length 234
Median length 129
Mean length 32.19431075
Min length 4

Characters and Unicode

Total characters 75840331
Distinct characters 134
Distinct categories 10
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 142214
Unique (%) 6.0%

Sample

1st row Hippolytidae
2nd row Paysonia lescurii (A.Gray) O'Kane & Al-Shehbaz
3rd row Desmognathus ochrophaeus Cope, 1859
4th row Scleractinia
5th row Ninoe kinbergi Ehlers, 1887
ValueCountFrequency (%)
263619
 
2.9%
l 187761
 
2.0%
ex 84339
 
0.9%
linnaeus 82433
 
0.9%
1758 64115
 
0.7%
plethodon 42963
 
0.5%
var 34730
 
0.4%
1818 33708
 
0.4%
subsp 33211
 
0.4%
kunth 31136
 
0.3%
Other values (179005) 8332858
90.7%

Most occurring characters

ValueCountFrequency (%)
6835167
 
9.0%
a 6257797
 
8.3%
i 5019804
 
6.6%
e 4734764
 
6.2%
r 3964580
 
5.2%
s 3861386
 
5.1%
o 3664229
 
4.8%
n 3503274
 
4.6%
l 3422094
 
4.5%
u 2966100
 
3.9%
Other values (124) 31611136
41.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 53031876
69.9%
Space Separator 6835167
 
9.0%
Uppercase Letter 6180529
 
8.1%
Decimal Number 4384846
 
5.8%
Other Punctuation 3234472
 
4.3%
Open Punctuation 1070948
 
1.4%
Close Punctuation 1070948
 
1.4%
Dash Punctuation 28359
 
< 0.1%
Math Symbol 3161
 
< 0.1%
Connector Punctuation 25
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6257797
11.8%
i 5019804
 
9.5%
e 4734764
 
8.9%
r 3964580
 
7.5%
s 3861386
 
7.3%
o 3664229
 
6.9%
n 3503274
 
6.6%
l 3422094
 
6.5%
u 2966100
 
5.6%
t 2761470
 
5.2%
Other values (61) 12876378
24.3%
Uppercase Letter
ValueCountFrequency (%)
L 588592
 
9.5%
S 572550
 
9.3%
C 544180
 
8.8%
P 518690
 
8.4%
A 420984
 
6.8%
M 419308
 
6.8%
B 409951
 
6.6%
H 341217
 
5.5%
G 326803
 
5.3%
D 278325
 
4.5%
Other values (33) 1759929
28.5%
Decimal Number
ValueCountFrequency (%)
1 1313369
30.0%
8 927854
21.2%
9 466939
 
10.6%
7 355077
 
8.1%
5 255118
 
5.8%
0 230866
 
5.3%
2 225110
 
5.1%
6 220162
 
5.0%
3 203545
 
4.6%
4 186806
 
4.3%
Other Punctuation
ValueCountFrequency (%)
. 1840636
56.9%
, 1124774
34.8%
& 263619
 
8.2%
' 5443
 
0.2%
Space Separator
ValueCountFrequency (%)
6835167
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1070948
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1070948
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 28359
100.0%
Math Symbol
ValueCountFrequency (%)
× 3161
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 25
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 59212405
78.1%
Common 16627926
 
21.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6257797
 
10.6%
i 5019804
 
8.5%
e 4734764
 
8.0%
r 3964580
 
6.7%
s 3861386
 
6.5%
o 3664229
 
6.2%
n 3503274
 
5.9%
l 3422094
 
5.8%
u 2966100
 
5.0%
t 2761470
 
4.7%
Other values (104) 19056907
32.2%
Common
ValueCountFrequency (%)
6835167
41.1%
. 1840636
 
11.1%
1 1313369
 
7.9%
, 1124774
 
6.8%
( 1070948
 
6.4%
) 1070948
 
6.4%
8 927854
 
5.6%
9 466939
 
2.8%
7 355077
 
2.1%
& 263619
 
1.6%
Other values (10) 1358595
 
8.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 75704690
99.8%
None 135641
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6835167
 
9.0%
a 6257797
 
8.3%
i 5019804
 
6.6%
e 4734764
 
6.3%
r 3964580
 
5.2%
s 3861386
 
5.1%
o 3664229
 
4.8%
n 3503274
 
4.6%
l 3422094
 
4.5%
u 2966100
 
3.9%
Other values (61) 31475495
41.6%
None
ValueCountFrequency (%)
ü 40827
30.1%
é 28282
20.9%
ö 18079
13.3%
è 11323
 
8.3%
á 5102
 
3.8%
ä 4987
 
3.7%
å 4932
 
3.6%
ø 4642
 
3.4%
× 3161
 
2.3%
Á 2128
 
1.6%
Other values (53) 12178
 
9.0%
Distinct 389005
Distinct (%) 17.2%
Missing 94306
Missing (%) 4.0%
Memory size 18.0 MiB

Length

Max length 125
Median length 97
Mean length 20.1873435
Min length 3

Characters and Unicode

Total characters 45768079
Distinct characters 94
Distinct categories 10
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 203004
Unique (%) 9.0%

Sample

1st row Lesquerella lescurii
2nd row Desmognathus ochrophaeus
3rd row Ninoe kinbergi
4th row Gomphus adelphus
5th row Skrjabinoclava catoptrophori
ValueCountFrequency (%)
sp 138550
 
2.8%
var 54090
 
1.1%
plethodon 42963
 
0.9%
subsp 26921
 
0.5%
cinereus 21966
 
0.4%
bombus 17610
 
0.4%
carex 14678
 
0.3%
indet 10551
 
0.2%
peromyscus 10026
 
0.2%
desmognathus 9258
 
0.2%
Other values (177028) 4602653
93.0%

Most occurring characters

ValueCountFrequency (%)
a 5045324
 
11.0%
i 4145959
 
9.1%
s 3377285
 
7.4%
e 3010560
 
6.6%
o 2851325
 
6.2%
r 2816880
 
6.2%
2682099
 
5.9%
u 2473140
 
5.4%
l 2459207
 
5.4%
n 2395933
 
5.2%
Other values (84) 14510367
31.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 40393006
88.3%
Space Separator 2682099
 
5.9%
Uppercase Letter 2329247
 
5.1%
Other Punctuation 248057
 
0.5%
Open Punctuation 53941
 
0.1%
Close Punctuation 53940
 
0.1%
Dash Punctuation 5649
 
< 0.1%
Decimal Number 2054
 
< 0.1%
Connector Punctuation 78
 
< 0.1%
Math Symbol 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5045324
12.5%
i 4145959
10.3%
s 3377285
 
8.4%
e 3010560
 
7.5%
o 2851325
 
7.1%
r 2816880
 
7.0%
u 2473140
 
6.1%
l 2459207
 
6.1%
n 2395933
 
5.9%
t 2138803
 
5.3%
Other values (27) 9678590
24.0%
Uppercase Letter
ValueCountFrequency (%)
P 347226
14.9%
C 308396
13.2%
A 214038
 
9.2%
S 207550
 
8.9%
M 156578
 
6.7%
L 125199
 
5.4%
E 119161
 
5.1%
T 112361
 
4.8%
D 105101
 
4.5%
B 97827
 
4.2%
Other values (18) 535810
23.0%
Decimal Number
ValueCountFrequency (%)
2 597
29.1%
1 490
23.9%
0 442
21.5%
5 259
12.6%
9 66
 
3.2%
8 53
 
2.6%
3 48
 
2.3%
7 42
 
2.0%
4 33
 
1.6%
6 24
 
1.2%
Other Punctuation
ValueCountFrequency (%)
. 242373
97.7%
" 2356
 
0.9%
, 1333
 
0.5%
' 1120
 
0.5%
& 607
 
0.2%
? 178
 
0.1%
/ 75
 
< 0.1%
# 14
 
< 0.1%
; 1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
+ 3
37.5%
× 3
37.5%
~ 2
25.0%
Open Punctuation
ValueCountFrequency (%)
( 53920
> 99.9%
[ 21
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 53919
> 99.9%
] 21
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2682099
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5649
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 78
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 42722253
93.3%
Common 3045826
 
6.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5045324
11.8%
i 4145959
 
9.7%
s 3377285
 
7.9%
e 3010560
 
7.0%
o 2851325
 
6.7%
r 2816880
 
6.6%
u 2473140
 
5.8%
l 2459207
 
5.8%
n 2395933
 
5.6%
t 2138803
 
5.0%
Other values (55) 12007837
28.1%
Common
ValueCountFrequency (%)
2682099
88.1%
. 242373
 
8.0%
( 53920
 
1.8%
) 53919
 
1.8%
- 5649
 
0.2%
" 2356
 
0.1%
, 1333
 
< 0.1%
' 1120
 
< 0.1%
& 607
 
< 0.1%
2 597
 
< 0.1%
Other values (19) 1853
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45767767
> 99.9%
None 312
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 5045324
 
11.0%
i 4145959
 
9.1%
s 3377285
 
7.4%
e 3010560
 
6.6%
o 2851325
 
6.2%
r 2816880
 
6.2%
2682099
 
5.9%
u 2473140
 
5.4%
l 2459207
 
5.4%
n 2395933
 
5.2%
Other values (70) 14510055
31.7%
None
ValueCountFrequency (%)
ë 184
59.0%
ö 38
 
12.2%
ü 28
 
9.0%
á 20
 
6.4%
Á 16
 
5.1%
é 11
 
3.5%
ó 4
 
1.3%
× 3
 
1.0%
É 2
 
0.6%
ñ 2
 
0.6%
Other values (4) 4
 
1.3%

typifiedName
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 2361471
Missing (%) > 99.9%
Memory size 18.0 MiB

Length

Max length 13
Median length 10.5
Mean length 10.5
Min length 8

Characters and Unicode

Total characters 21
Distinct characters 15
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row French Guiana
2nd row Malvales
ValueCountFrequency (%)
french 1
33.3%
guiana 1
33.3%
malvales 1
33.3%

Most occurring characters

ValueCountFrequency (%)
a 4
19.0%
e 2
 
9.5%
n 2
 
9.5%
l 2
 
9.5%
F 1
 
4.8%
r 1
 
4.8%
c 1
 
4.8%
h 1
 
4.8%
1
 
4.8%
G 1
 
4.8%
Other values (5) 5
23.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 17
81.0%
Uppercase Letter 3
 
14.3%
Space Separator 1
 
4.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
23.5%
e 2
11.8%
n 2
11.8%
l 2
11.8%
r 1
 
5.9%
c 1
 
5.9%
h 1
 
5.9%
u 1
 
5.9%
i 1
 
5.9%
v 1
 
5.9%
Uppercase Letter
ValueCountFrequency (%)
F 1
33.3%
G 1
33.3%
M 1
33.3%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 20
95.2%
Common 1
 
4.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
20.0%
e 2
10.0%
n 2
10.0%
l 2
10.0%
F 1
 
5.0%
r 1
 
5.0%
c 1
 
5.0%
h 1
 
5.0%
G 1
 
5.0%
u 1
 
5.0%
Other values (4) 4
20.0%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4
19.0%
e 2
 
9.5%
n 2
 
9.5%
l 2
 
9.5%
F 1
 
4.8%
r 1
 
4.8%
c 1
 
4.8%
h 1
 
4.8%
1
 
4.8%
G 1
 
4.8%
Other values (5) 5
23.8%
Distinct 3
Distinct (%) < 0.1%
Missing 11
Missing (%) < 0.1%
Memory size 18.0 MiB

Length

Max length 48
Median length 3
Mean length 3.00002075
Min length 3

Characters and Unicode

Total characters 7084435
Distinct characters 19
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) < 0.1%

Sample

1st row EML
2nd row EML
3rd row EML
4th row EML
5th row EML
ValueCountFrequency (%)
eml 2361460
> 99.9%
guf.1_1 1
 
< 0.1%
occurrence_status_inferred_from_individual_count 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
E 2361464
33.3%
L 2361461
33.3%
M 2361461
33.3%
_ 6
 
< 0.1%
R 5
 
< 0.1%
U 5
 
< 0.1%
I 4
 
< 0.1%
N 4
 
< 0.1%
C 4
 
< 0.1%
D 3
 
< 0.1%
Other values (9) 18
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 7084426
> 99.9%
Connector Punctuation 6
 
< 0.1%
Decimal Number 2
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 2361464
33.3%
L 2361461
33.3%
M 2361461
33.3%
R 5
 
< 0.1%
U 5
 
< 0.1%
I 4
 
< 0.1%
N 4
 
< 0.1%
C 4
 
< 0.1%
D 3
 
< 0.1%
T 3
 
< 0.1%
Other values (6) 12
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 6
100.0%
Decimal Number
ValueCountFrequency (%)
1 2
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7084426
> 99.9%
Common 9
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 2361464
33.3%
L 2361461
33.3%
M 2361461
33.3%
R 5
 
< 0.1%
U 5
 
< 0.1%
I 4
 
< 0.1%
N 4
 
< 0.1%
C 4
 
< 0.1%
D 3
 
< 0.1%
T 3
 
< 0.1%
Other values (6) 12
 
< 0.1%
Common
ValueCountFrequency (%)
_ 6
66.7%
1 2
 
22.2%
. 1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7084435
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 2361464
33.3%
L 2361461
33.3%
M 2361461
33.3%
_ 6
 
< 0.1%
R 5
 
< 0.1%
U 5
 
< 0.1%
I 4
 
< 0.1%
N 4
 
< 0.1%
C 4
 
< 0.1%
D 3
 
< 0.1%
Other values (9) 18
 
< 0.1%
Distinct 210769
Distinct (%) 8.9%
Missing 4
Missing (%) < 0.1%
Memory size 18.0 MiB

Length

Max length 24
Median length 24
Mean length 23.99603679
Min length 7

Characters and Unicode

Total characters 56665897
Distinct characters 42
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7665
Unique (%) 0.3%

Sample

1st row 2024-12-02T13:59:36.683Z
2nd row 2024-12-02T13:59:14.817Z
3rd row 2024-12-02T13:57:42.802Z
4th row 2024-12-02T13:59:13.837Z
5th row 2024-12-02T13:57:45.358Z
ValueCountFrequency (%)
2024-12-02t13:57:25.039z 46
 
< 0.1%
2024-12-02t13:57:24.083z 45
 
< 0.1%
2024-12-02t13:57:45.003z 45
 
< 0.1%
2024-12-02t13:57:28.833z 45
 
< 0.1%
2024-12-02t13:57:34.491z 44
 
< 0.1%
2024-12-02t13:57:52.915z 44
 
< 0.1%
2024-12-02t13:57:52.924z 43
 
< 0.1%
2024-12-02t13:57:43.166z 43
 
< 0.1%
2024-12-02t13:57:52.893z 42
 
< 0.1%
2024-12-02t13:57:42.743z 42
 
< 0.1%
Other values (210759) 2361030
> 99.9%

Most occurring characters

ValueCountFrequency (%)
2 10789973
19.0%
0 5989862
10.6%
1 5962965
10.5%
: 4722920
8.3%
- 4722920
8.3%
4 3794952
 
6.7%
5 3740547
 
6.6%
3 3738231
 
6.6%
T 2361460
 
4.2%
Z 2361460
 
4.2%
Other values (32) 8480607
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 40137896
70.8%
Other Punctuation 7082072
 
12.5%
Uppercase Letter 4722930
 
8.3%
Dash Punctuation 4722920
 
8.3%
Lowercase Letter 79
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 10
12.7%
e 10
12.7%
i 9
11.4%
n 7
8.9%
c 6
7.6%
u 5
 
6.3%
l 5
 
6.3%
r 5
 
6.3%
t 4
 
5.1%
m 4
 
5.1%
Other values (9) 14
17.7%
Decimal Number
ValueCountFrequency (%)
2 10789973
26.9%
0 5989862
14.9%
1 5962965
14.9%
4 3794952
 
9.5%
5 3740547
 
9.3%
3 3738231
 
9.3%
7 1827732
 
4.6%
9 1516897
 
3.8%
6 1417013
 
3.5%
8 1359724
 
3.4%
Uppercase Letter
ValueCountFrequency (%)
T 2361460
50.0%
Z 2361460
50.0%
M 2
 
< 0.1%
P 2
 
< 0.1%
H 1
 
< 0.1%
U 1
 
< 0.1%
C 1
 
< 0.1%
S 1
 
< 0.1%
L 1
 
< 0.1%
I 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
: 4722920
66.7%
. 2359152
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 4722920
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 51942888
91.7%
Latin 4723009
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 2361460
50.0%
Z 2361460
50.0%
a 10
 
< 0.1%
e 10
 
< 0.1%
i 9
 
< 0.1%
n 7
 
< 0.1%
c 6
 
< 0.1%
u 5
 
< 0.1%
l 5
 
< 0.1%
r 5
 
< 0.1%
Other values (19) 32
 
< 0.1%
Common
ValueCountFrequency (%)
2 10789973
20.8%
0 5989862
11.5%
1 5962965
11.5%
: 4722920
9.1%
- 4722920
9.1%
4 3794952
 
7.3%
5 3740547
 
7.2%
3 3738231
 
7.2%
. 2359152
 
4.5%
7 1827732
 
3.5%
Other values (3) 4293634
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56665897
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 10789973
19.0%
0 5989862
10.6%
1 5962965
10.5%
: 4722920
8.3%
- 4722920
8.3%
4 3794952
 
6.7%
5 3740547
 
6.6%
3 3738231
 
6.6%
T 2361460
 
4.2%
Z 2361460
 
4.2%
Other values (32) 8480607
15.0%
Distinct 9
Distinct (%) < 0.1%
Missing 5
Missing (%) < 0.1%
Memory size 18.0 MiB

Length

Max length 24
Median length 24
Mean length 23.99995045
Min length 5

Characters and Unicode

Total characters 56675115
Distinct characters 38
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 8
Unique (%) < 0.1%

Sample

1st row 2024-12-02T11:48:23.416Z
2nd row 2024-12-02T11:48:23.416Z
3rd row 2024-12-02T11:48:23.416Z
4th row 2024-12-02T11:48:23.416Z
5th row 2024-12-02T11:48:23.416Z
ValueCountFrequency (%)
2024-12-02t11:48:23.416z 2361460
> 99.9%
uncinaria 1
 
< 0.1%
haemoproteus 1
 
< 0.1%
phyllobothrium 1
 
< 0.1%
guf.1.11_1 1
 
< 0.1%
distomum 1
 
< 0.1%
senecio 1
 
< 0.1%
false 1
 
< 0.1%
merluccius 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
2 11807300
20.8%
1 9445844
16.7%
4 7084380
12.5%
- 4722920
 
8.3%
: 4722920
 
8.3%
0 4722920
 
8.3%
. 2361462
 
4.2%
T 2361460
 
4.2%
8 2361460
 
4.2%
3 2361460
 
4.2%
Other values (28) 4722989
8.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 40144824
70.8%
Other Punctuation 7084382
 
12.5%
Uppercase Letter 4722929
 
8.3%
Dash Punctuation 4722920
 
8.3%
Lowercase Letter 59
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 6
10.2%
o 6
10.2%
e 6
10.2%
u 5
8.5%
c 4
 
6.8%
r 4
 
6.8%
m 4
 
6.8%
s 4
 
6.8%
a 4
 
6.8%
l 4
 
6.8%
Other values (7) 12
20.3%
Uppercase Letter
ValueCountFrequency (%)
T 2361460
50.0%
Z 2361460
50.0%
U 2
 
< 0.1%
F 1
 
< 0.1%
S 1
 
< 0.1%
D 1
 
< 0.1%
P 1
 
< 0.1%
G 1
 
< 0.1%
H 1
 
< 0.1%
M 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 11807300
29.4%
1 9445844
23.5%
4 7084380
17.6%
0 4722920
 
11.8%
8 2361460
 
5.9%
3 2361460
 
5.9%
6 2361460
 
5.9%
Other Punctuation
ValueCountFrequency (%)
: 4722920
66.7%
. 2361462
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 4722920
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 51952127
91.7%
Latin 4722988
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 2361460
50.0%
Z 2361460
50.0%
i 6
 
< 0.1%
o 6
 
< 0.1%
e 6
 
< 0.1%
u 5
 
< 0.1%
c 4
 
< 0.1%
r 4
 
< 0.1%
m 4
 
< 0.1%
s 4
 
< 0.1%
Other values (17) 29
 
< 0.1%
Common
ValueCountFrequency (%)
2 11807300
22.7%
1 9445844
18.2%
4 7084380
13.6%
- 4722920
 
9.1%
: 4722920
 
9.1%
0 4722920
 
9.1%
. 2361462
 
4.5%
8 2361460
 
4.5%
3 2361460
 
4.5%
6 2361460
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56675115
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 11807300
20.8%
1 9445844
16.7%
4 7084380
12.5%
- 4722920
 
8.3%
: 4722920
 
8.3%
0 4722920
 
8.3%
. 2361462
 
4.2%
T 2361460
 
4.2%
8 2361460
 
4.2%
3 2361460
 
4.2%
Other values (28) 4722989
8.3%

repatriated
Text

Missing 

Distinct3
Distinct (%)< 0.1%
Missing92313
Missing (%)3.9%
Memory size18.0 MiB

Length

Max length10
Median length4
Mean length4.372713251
Min length4

Characters and Unicode

Total characters9922386
Distinct characters13
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
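
As a rough illustration of how such category counts can be derived, the sketch below uses Python's standard unicodedata module to tally Unicode general categories over a few sample strings. The helper name category_counts, the CATEGORY_NAMES mapping, and the sample values (including the capitalisation of "Saint-Elie") are assumptions made for illustration; this is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

# Readable labels for a few general-category codes (illustrative subset).
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Pc": "Connector Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code if no label is defined.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Hypothetical sample values, one occurrence each.
sample = ["true", "false", "Saint-Elie"]
print(category_counts(sample))
```

Applied to a whole column, the same tally would yield per-category totals of the kind listed under "Most occurring categories".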

Unique

Unique1
Unique (%)< 0.1%

Sample

1st rowtrue
2nd rowfalse
3rd rowfalse
4th rowfalse
5th rowfalse
Value        Count     Frequency (%)
true         1423419   62.7%
false        845740    37.3%
saint-elie   1         < 0.1%
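
The figures reported for this variable can be cross-checked with simple arithmetic: the mean length should equal the total characters divided by the number of non-missing observations. The sketch below is a minimal check, assuming the observation count of 2361473 from the report overview and that each value's length is its character count; the variable names are illustrative.

```python
# Figures copied from the report for the `repatriated` variable.
total_observations = 2361473  # from the dataset overview
missing = 92313
total_characters = 9922386

# Value counts from the frequency table above.
value_counts = {"true": 1423419, "false": 845740, "saint-elie": 1}

non_missing = total_observations - missing
mean_length = total_characters / non_missing

# The counts should add up to the non-missing total, and the
# length-weighted counts should add up to the total characters.
count_sum = sum(value_counts.values())
char_sum = sum(len(value) * count for value, count in value_counts.items())

print(non_missing, count_sum, char_sum, round(mean_length, 9))
```

If the counts, character total, and mean length agree, the single stray value is the only entry in the column that is neither "true" nor "false".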

Most occurring characters

ValueCountFrequency (%)
e 2269160
22.9%
t 1423420
14.3%
r 1423419
14.3%
u 1423419
14.3%
a 845741
 
8.5%
l 845741
 
8.5%
f 845740
 
8.5%
s 845740
 
8.5%
i 2
 
< 0.1%
S 1
 
< 0.1%
Other values (3) 3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9922383
> 99.9%
Uppercase Letter 2
 
< 0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2269160
22.9%
t 1423420
14.3%
r 1423419
14.3%
u 1423419
14.3%
a 845741
 
8.5%
l 845741
 
8.5%
f 845740
 
8.5%
s 845740
 
8.5%
i 2
 
< 0.1%
n 1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
S 1
50.0%
E 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9922385
> 99.9%
Common 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2269160
22.9%
t 1423420
14.3%
r 1423419
14.3%
u 1423419
14.3%
a 845741
 
8.5%
l 845741
 
8.5%
f 845740
 
8.5%
s 845740
 
8.5%
i 2
 
< 0.1%
S 1
 
< 0.1%
Other values (2) 2
 
< 0.1%
Common
ValueCountFrequency (%)
- 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9922386
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2269160
22.9%
t 1423420
14.3%
r 1423419
14.3%
u 1423419
14.3%
a 845741
 
8.5%
l 845741
 
8.5%
f 845740
 
8.5%
s 845740
 
8.5%
i 2
 
< 0.1%
S 1
 
< 0.1%
Other values (3) 3
 
< 0.1%

relativeOrganismQuantity
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing2361472
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters7
Distinct characters4
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st row3034046
Value     Count   Frequency (%)
3034046   1       100.0%

Most occurring characters

ValueCountFrequency (%)
3 2
28.6%
0 2
28.6%
4 2
28.6%
6 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 2
28.6%
0 2
28.6%
4 2
28.6%
6 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Common 7
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 2
28.6%
0 2
28.6%
4 2
28.6%
6 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 2
28.6%
0 2
28.6%
4 2
28.6%
6 1
14.3%

projectId
Text

Missing 

Distinct6
Distinct (%)100.0%
Missing2361467
Missing (%)> 99.9%
Memory size18.0 MiB

Length

Max length11
Median length9
Mean length8
Min length5

Characters and Unicode

Total characters48
Distinct characters22
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6
Unique (%)100.0%

Sample

1st rowcaudatum
2nd rowvibex
3rd rowSphaeralcea
4th rowblumeri
5th row3034046
Value         Count   Frequency (%)
caudatum      1       16.7%
vibex         1       16.7%
sphaeralcea   1       16.7%
blumeri       1       16.7%
3034046       1       16.7%
bilinearis    1       16.7%

Most occurring characters

ValueCountFrequency (%)
a 6
12.5%
i 5
 
10.4%
e 5
 
10.4%
r 3
 
6.2%
u 3
 
6.2%
b 3
 
6.2%
l 3
 
6.2%
c 2
 
4.2%
m 2
 
4.2%
4 2
 
4.2%
Other values (12) 14
29.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 40
83.3%
Decimal Number 7
 
14.6%
Uppercase Letter 1
 
2.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6
15.0%
i 5
12.5%
e 5
12.5%
r 3
7.5%
u 3
7.5%
b 3
7.5%
l 3
7.5%
c 2
 
5.0%
m 2
 
5.0%
n 1
 
2.5%
Other values (7) 7
17.5%
Decimal Number
ValueCountFrequency (%)
4 2
28.6%
0 2
28.6%
3 2
28.6%
6 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
S 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 41
85.4%
Common 7
 
14.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6
14.6%
i 5
12.2%
e 5
12.2%
r 3
 
7.3%
u 3
 
7.3%
b 3
 
7.3%
l 3
 
7.3%
c 2
 
4.9%
m 2
 
4.9%
n 1
 
2.4%
Other values (8) 8
19.5%
Common
ValueCountFrequency (%)
4 2
28.6%
0 2
28.6%
3 2
28.6%
6 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 48
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 6
12.5%
i 5
 
10.4%
e 5
 
10.4%
r 3
 
6.2%
u 3
 
6.2%
b 3
 
6.2%
l 3
 
6.2%
c 2
 
4.2%
m 2
 
4.2%
4 2
 
4.2%
Other values (12) 14
29.2%
Distinct5
Distinct (%)< 0.1%
Missing10
Missing (%)< 0.1%
Memory size18.0 MiB

Length

Max length11
Median length5
Mean length4.998686408
Min length1

Characters and Unicode

Total characters11804213
Distinct characters15
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3
Unique (%)< 0.1%

Sample

1st rowfalse
2nd rowfalse
3rd rowfalse
4th rowfalse
5th rowfalse
Value         Count     Frequency (%)
false         2358359   99.9%
true          3101      0.1%
lc            1         < 0.1%
sphaeralcea   1         < 0.1%
6             1         < 0.1%

Most occurring characters

ValueCountFrequency (%)
e 2361462
20.0%
a 2358362
20.0%
l 2358360
20.0%
f 2358359
20.0%
s 2358359
20.0%
r 3102
 
< 0.1%
t 3101
 
< 0.1%
u 3101
 
< 0.1%
L 1
 
< 0.1%
C 1
 
< 0.1%
Other values (5) 5
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11804209
> 99.9%
Uppercase Letter 3
 
< 0.1%
Decimal Number 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2361462
20.0%
a 2358362
20.0%
l 2358360
20.0%
f 2358359
20.0%
s 2358359
20.0%
r 3102
 
< 0.1%
t 3101
 
< 0.1%
u 3101
 
< 0.1%
p 1
 
< 0.1%
h 1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
L 1
33.3%
C 1
33.3%
S 1
33.3%
Decimal Number
ValueCountFrequency (%)
6 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11804212
> 99.9%
Common 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2361462
20.0%
a 2358362
20.0%
l 2358360
20.0%
f 2358359
20.0%
s 2358359
20.0%
r 3102
 
< 0.1%
t 3101
 
< 0.1%
u 3101
 
< 0.1%
L 1
 
< 0.1%
C 1
 
< 0.1%
Other values (4) 4
 
< 0.1%
Common
ValueCountFrequency (%)
6 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11804213
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2361462
20.0%
a 2358362
20.0%
l 2358360
20.0%
f 2358359
20.0%
s 2358359
20.0%
r 3102
 
< 0.1%
t 3101
 
< 0.1%
u 3101
 
< 0.1%
L 1
 
< 0.1%
C 1
 
< 0.1%
Other values (5) 5
 
< 0.1%

gbifRegion
Text

Missing 

Distinct8
Distinct (%)< 0.1%
Missing114374
Missing (%)4.8%
Memory size18.0 MiB

Length

Max length13
Median length13
Mean length10.97984156
Min length4

Characters and Unicode

Total characters24672791
Distinct characters20
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)< 0.1%

Sample

1st rowLATIN_AMERICA
2nd rowNORTH_AMERICA
3rd rowNORTH_AMERICA
4th rowNORTH_AMERICA
5th rowNORTH_AMERICA
Value           Count    Frequency (%)
north_america   899421   40.0%
latin_america   745190   33.2%
asia            257467   11.5%
oceania         127573   5.7%
africa          108539   4.8%
europe          92588    4.1%
antarctica      16320    0.7%
7707728         1        < 0.1%

Most occurring characters

ValueCountFrequency (%)
A 5070530
20.6%
I 2899700
11.8%
R 2761479
11.2%
E 1957360
 
7.9%
C 1913363
 
7.8%
N 1788504
 
7.2%
T 1677251
 
6.8%
_ 1644611
 
6.7%
M 1644611
 
6.7%
O 1119582
 
4.5%
Other values (10) 2195800
8.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 23028173
93.3%
Connector Punctuation 1644611
 
6.7%
Decimal Number 7
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 5070530
22.0%
I 2899700
12.6%
R 2761479
12.0%
E 1957360
 
8.5%
C 1913363
 
8.3%
N 1788504
 
7.8%
T 1677251
 
7.3%
M 1644611
 
7.1%
O 1119582
 
4.9%
H 899421
 
3.9%
Other values (5) 1296372
 
5.6%
Decimal Number
ValueCountFrequency (%)
7 4
57.1%
0 1
 
14.3%
2 1
 
14.3%
8 1
 
14.3%
Connector Punctuation
ValueCountFrequency (%)
_ 1644611
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23028173
93.3%
Common 1644618
 
6.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 5070530
22.0%
I 2899700
12.6%
R 2761479
12.0%
E 1957360
 
8.5%
C 1913363
 
8.3%
N 1788504
 
7.8%
T 1677251
 
7.3%
M 1644611
 
7.1%
O 1119582
 
4.9%
H 899421
 
3.9%
Other values (5) 1296372
 
5.6%
Common
ValueCountFrequency (%)
_ 1644611
> 99.9%
7 4
 
< 0.1%
0 1
 
< 0.1%
2 1
 
< 0.1%
8 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24672791
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 5070530
20.6%
I 2899700
11.8%
R 2761479
11.2%
E 1957360
 
7.9%
C 1913363
 
7.8%
N 1788504
 
7.2%
T 1677251
 
6.8%
_ 1644611
 
6.7%
M 1644611
 
6.7%
O 1119582
 
4.5%
Other values (10) 2195800
8.9%
Distinct4
Distinct (%)< 0.1%
Missing6
Missing (%)< 0.1%
Memory size18.0 MiB

Length

Max length13
Median length13
Mean length12.99997883
Min length3

Characters and Unicode

Total characters30699021
Distinct characters17
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)< 0.1%

Sample

1st rowNORTH_AMERICA
2nd rowNORTH_AMERICA
3rd rowNORTH_AMERICA
4th rowNORTH_AMERICA
5th rowNORTH_AMERICA
Value           Count     Frequency (%)
north_america   2361460   > 99.9%
species         4         < 0.1%
genus           2         < 0.1%
220             1         < 0.1%

Most occurring characters

ValueCountFrequency (%)
A 4722920
15.4%
R 4722920
15.4%
E 2361470
7.7%
C 2361464
7.7%
I 2361464
7.7%
N 2361462
7.7%
_ 2361460
7.7%
M 2361460
7.7%
O 2361460
7.7%
H 2361460
7.7%
Other values (7) 2361481
7.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 28337558
92.3%
Connector Punctuation 2361460
 
7.7%
Decimal Number 3
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 4722920
16.7%
R 4722920
16.7%
E 2361470
8.3%
C 2361464
8.3%
I 2361464
8.3%
N 2361462
8.3%
M 2361460
8.3%
O 2361460
8.3%
H 2361460
8.3%
T 2361460
8.3%
Other values (4) 18
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 2
66.7%
0 1
33.3%
Connector Punctuation
ValueCountFrequency (%)
_ 2361460
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 28337558
92.3%
Common 2361463
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 4722920
16.7%
R 4722920
16.7%
E 2361470
8.3%
C 2361464
8.3%
I 2361464
8.3%
N 2361462
8.3%
M 2361460
8.3%
O 2361460
8.3%
H 2361460
8.3%
T 2361460
8.3%
Other values (4) 18
 
< 0.1%
Common
ValueCountFrequency (%)
_ 2361460
> 99.9%
2 2
 
< 0.1%
0 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30699021
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 4722920
15.4%
R 4722920
15.4%
E 2361470
7.7%
C 2361464
7.7%
I 2361464
7.7%
N 2361462
7.7%
_ 2361460
7.7%
M 2361460
7.7%
O 2361460
7.7%
H 2361460
7.7%
Other values (7) 2361481
7.7%

level0Gid
Text

Missing 

Distinct239
Distinct (%)0.1%
Missing1911133
Missing (%)80.9%
Memory size18.0 MiB

Length

Max length7
Median length3
Mean length3.000008882
Min length3

Characters and Unicode

Total characters1351024
Distinct characters38
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6
Unique (%)< 0.1%

Sample

1st rowUSA
2nd rowUSA
3rd rowUSA
4th rowUSA
5th rowCRI
Value                Count    Frequency (%)
usa                  191983   42.6%
ven                  20119    4.5%
bra                  20003    4.4%
guy                  18195    4.0%
mex                  16475    3.7%
ecu                  12944    2.9%
per                  9844     2.2%
can                  9240     2.1%
pan                  6074     1.3%
bol                  6000     1.3%
Other values (229)   139463   31.0%

Most occurring characters

ValueCountFrequency (%)
A 260823
19.3%
U 244099
18.1%
S 209561
15.5%
E 69205
 
5.1%
N 67199
 
5.0%
R 56601
 
4.2%
C 47627
 
3.5%
G 44616
 
3.3%
M 43317
 
3.2%
B 38463
 
2.8%
Other values (28) 269513
19.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1350250
99.9%
Decimal Number 767
 
0.1%
Lowercase Letter 7
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 260823
19.3%
U 244099
18.1%
S 209561
15.5%
E 69205
 
5.1%
N 67199
 
5.0%
R 56601
 
4.2%
C 47627
 
3.5%
G 44616
 
3.3%
M 43317
 
3.2%
B 38463
 
2.8%
Other values (16) 268739
19.9%
Lowercase Letter
ValueCountFrequency (%)
p 1
14.3%
a 1
14.3%
l 1
14.3%
m 1
14.3%
e 1
14.3%
r 1
14.3%
i 1
14.3%
Decimal Number
ValueCountFrequency (%)
0 383
49.9%
1 299
39.0%
6 75
 
9.8%
7 9
 
1.2%
4 1
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1350257
99.9%
Common 767
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 260823
19.3%
U 244099
18.1%
S 209561
15.5%
E 69205
 
5.1%
N 67199
 
5.0%
R 56601
 
4.2%
C 47627
 
3.5%
G 44616
 
3.3%
M 43317
 
3.2%
B 38463
 
2.8%
Other values (23) 268746
19.9%
Common
ValueCountFrequency (%)
0 383
49.9%
1 299
39.0%
6 75
 
9.8%
7 9
 
1.2%
4 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1351024
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 260823
19.3%
U 244099
18.1%
S 209561
15.5%
E 69205
 
5.1%
N 67199
 
5.0%
R 56601
 
4.2%
C 47627
 
3.5%
G 44616
 
3.3%
M 43317
 
3.2%
B 38463
 
2.8%
Other values (28) 269513
19.9%

level0Name
Text

Missing 

Distinct238
Distinct (%)0.1%
Missing1911134
Missing (%)80.9%
Memory size18.0 MiB

Length

Max length32
Median length30
Mean length10.17422653
Min length4

Characters and Unicode

Total characters4581851
Distinct characters65
Distinct categories8
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5
Unique (%)< 0.1%

Sample

1st rowUnited States
2nd rowUnited States
3rd rowUnited States
4th rowUnited States
5th rowCosta Rica
Value                Count    Frequency (%)
united               193094   27.7%
states               192247   27.6%
venezuela            20119    2.9%
brazil               20003    2.9%
guyana               18195    2.6%
méxico               16475    2.4%
ecuador              12944    1.9%
peru                 9844     1.4%
canada               9240     1.3%
french               6658     1.0%
Other values (277)   197267   28.3%

Most occurring characters

ValueCountFrequency (%)
t 615456
13.4%
a 532048
11.6%
e 528246
11.5%
i 371986
 
8.1%
n 350657
 
7.7%
245747
 
5.4%
d 242883
 
5.3%
s 236258
 
5.2%
S 209349
 
4.6%
U 194325
 
4.2%
Other values (55) 1054896
23.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3641450
79.5%
Uppercase Letter 692317
 
15.1%
Space Separator 245747
 
5.4%
Other Punctuation 2266
 
< 0.1%
Open Punctuation 23
 
< 0.1%
Close Punctuation 23
 
< 0.1%
Dash Punctuation 21
 
< 0.1%
Decimal Number 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 615456
16.9%
a 532048
14.6%
e 528246
14.5%
i 371986
10.2%
n 350657
9.6%
d 242883
 
6.7%
s 236258
 
6.5%
u 115753
 
3.2%
o 109573
 
3.0%
r 103656
 
2.8%
Other values (21) 434934
11.9%
Uppercase Letter
ValueCountFrequency (%)
S 209349
30.2%
U 194325
28.1%
G 34228
 
4.9%
C 34212
 
4.9%
B 32848
 
4.7%
P 30756
 
4.4%
M 29201
 
4.2%
V 21508
 
3.1%
E 16779
 
2.4%
A 15001
 
2.2%
Other values (15) 74110
 
10.7%
Other Punctuation
ValueCountFrequency (%)
' 865
38.2%
. 756
33.4%
, 645
28.5%
Decimal Number
ValueCountFrequency (%)
6 2
50.0%
8 2
50.0%
Space Separator
ValueCountFrequency (%)
245747
100.0%
Open Punctuation
ValueCountFrequency (%)
( 23
100.0%
Close Punctuation
ValueCountFrequency (%)
) 23
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4333767
94.6%
Common 248084
 
5.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 615456
14.2%
a 532048
12.3%
e 528246
12.2%
i 371986
 
8.6%
n 350657
 
8.1%
d 242883
 
5.6%
s 236258
 
5.5%
S 209349
 
4.8%
U 194325
 
4.5%
u 115753
 
2.7%
Other values (46) 936806
21.6%
Common
ValueCountFrequency (%)
245747
99.1%
' 865
 
0.3%
. 756
 
0.3%
, 645
 
0.3%
( 23
 
< 0.1%
) 23
 
< 0.1%
- 21
 
< 0.1%
6 2
 
< 0.1%
8 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4563119
99.6%
None 18732
 
0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 615456
13.5%
a 532048
11.7%
e 528246
11.6%
i 371986
 
8.2%
n 350657
 
7.7%
245747
 
5.4%
d 242883
 
5.3%
s 236258
 
5.2%
S 209349
 
4.6%
U 194325
 
4.3%
Other values (50) 1036164
22.7%
None
ValueCountFrequency (%)
é 16737
89.3%
ô 865
 
4.6%
ç 628
 
3.4%
í 251
 
1.3%
ã 251
 
1.3%

level1Gid
Text

Missing 

Distinct2569
Distinct (%)0.6%
Missing1912772
Missing (%)81.0%
Memory size18.0 MiB

Length

Max length8
Median length8
Mean length7.592494779
Min length6

Characters and Unicode

Total characters3406760
Distinct characters38
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique311
Unique (%)0.1%

Sample

1st rowUSA.49_1
2nd rowUSA.20_1
3rd rowUSA.32_1
4th rowUSA.38_1
5th rowCRI.2_1
Value                 Count    Frequency (%)
usa.47_1              28132    6.3%
usa.21_1              17503    3.9%
usa.34_1              16052    3.6%
usa.5_1               12531    2.8%
usa.10_1              9880     2.2%
usa.49_1              6455     1.4%
ven.1_1               6336     1.4%
usa.39_1              6190     1.4%
usa.6_1               5820     1.3%
usa.9_1               5812     1.3%
Other values (2559)   333990   74.4%

Most occurring characters

ValueCountFrequency (%)
1 602900
17.7%
_ 448654
13.2%
. 446732
13.1%
A 259830
 
7.6%
U 243468
 
7.1%
S 209473
 
6.1%
2 119917
 
3.5%
4 112033
 
3.3%
3 88307
 
2.6%
E 69205
 
2.0%
Other values (28) 806241
23.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1345474
39.5%
Decimal Number 1165900
34.2%
Connector Punctuation 448654
 
13.2%
Other Punctuation 446732
 
13.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 259830
19.3%
U 243468
18.1%
S 209473
15.6%
E 69205
 
5.1%
N 67135
 
5.0%
R 56358
 
4.2%
C 46945
 
3.5%
G 44614
 
3.3%
M 43301
 
3.2%
B 38446
 
2.9%
Other values (16) 266699
19.8%
Decimal Number
ValueCountFrequency (%)
1 602900
51.7%
2 119917
 
10.3%
4 112033
 
9.6%
3 88307
 
7.6%
5 48668
 
4.2%
7 48244
 
4.1%
9 42418
 
3.6%
6 37250
 
3.2%
8 33151
 
2.8%
0 33012
 
2.8%
Connector Punctuation
ValueCountFrequency (%)
_ 448654
100.0%
Other Punctuation
ValueCountFrequency (%)
. 446732
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2061286
60.5%
Latin 1345474
39.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 259830
19.3%
U 243468
18.1%
S 209473
15.6%
E 69205
 
5.1%
N 67135
 
5.0%
R 56358
 
4.2%
C 46945
 
3.5%
G 44614
 
3.3%
M 43301
 
3.2%
B 38446
 
2.9%
Other values (16) 266699
19.8%
Common
ValueCountFrequency (%)
1 602900
29.2%
_ 448654
21.8%
. 446732
21.7%
2 119917
 
5.8%
4 112033
 
5.4%
3 88307
 
4.3%
5 48668
 
2.4%
7 48244
 
2.3%
9 42418
 
2.1%
6 37250
 
1.8%
Other values (2) 66163
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3406760
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 602900
17.7%
_ 448654
13.2%
. 446732
13.1%
A 259830
 
7.6%
U 243468
 
7.1%
S 209473
 
6.1%
2 119917
 
3.5%
4 112033
 
3.3%
3 88307
 
2.6%
E 69205
 
2.0%
Other values (28) 806241
23.7%

level1Name
Text

Missing 

Distinct2469
Distinct (%)0.6%
Missing1912766
Missing (%)81.0%
Memory size18.0 MiB

Length

Max length32
Median length29
Mean length9.529755497
Min length3

Characters and Unicode

Total characters4276068
Distinct characters136
Distinct categories8
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique305
Unique (%)0.1%

Sample

1st rowWest Virginia
2nd rowMaine
3rd rowNew Mexico
4th rowOregon
5th rowCartago
Value                 Count    Frequency (%)
virginia              34587    5.8%
carolina              18805    3.2%
maryland              17505    2.9%
north                 16899    2.8%
california            14263    2.4%
amazonas              11119    1.9%
florida               9889     1.7%
new                   9352     1.6%
columbia              7231     1.2%
west                  7206     1.2%
Other values (2653)   446841   75.3%

Most occurring characters

ValueCountFrequency (%)
a 626722
14.7%
i 390317
 
9.1%
n 330354
 
7.7%
r 306970
 
7.2%
o 292322
 
6.8%
e 218622
 
5.1%
s 178051
 
4.2%
l 167995
 
3.9%
t 152746
 
3.6%
144990
 
3.4%
Other values (126) 1466979
34.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3489310
81.6%
Uppercase Letter 604684
 
14.1%
Space Separator 144990
 
3.4%
Dash Punctuation 34622
 
0.8%
Other Punctuation 2409
 
0.1%
Modifier Symbol 47
 
< 0.1%
Open Punctuation 3
 
< 0.1%
Close Punctuation 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 626722
18.0%
i 390317
11.2%
n 330354
9.5%
r 306970
8.8%
o 292322
8.4%
e 218622
 
6.3%
s 178051
 
5.1%
l 167995
 
4.8%
t 152746
 
4.4%
u 136092
 
3.9%
Other values (76) 689119
19.7%
Uppercase Letter
ValueCountFrequency (%)
C 85346
14.1%
M 61462
 
10.2%
S 45740
 
7.6%
N 43998
 
7.3%
A 41807
 
6.9%
V 40147
 
6.6%
P 34481
 
5.7%
T 33279
 
5.5%
B 24460
 
4.0%
O 20393
 
3.4%
Other values (30) 173571
28.7%
Other Punctuation
ValueCountFrequency (%)
' 1289
53.5%
. 405
 
16.8%
! 336
 
13.9%
/ 270
 
11.2%
, 109
 
4.5%
Space Separator
ValueCountFrequency (%)
144990
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 34622
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 47
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 3
100.0%
Close Punctuation
ValueCountFrequency (%)
] 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4093994
95.7%
Common 182074
 
4.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 626722
15.3%
i 390317
 
9.5%
n 330354
 
8.1%
r 306970
 
7.5%
o 292322
 
7.1%
e 218622
 
5.3%
s 178051
 
4.3%
l 167995
 
4.1%
t 152746
 
3.7%
u 136092
 
3.3%
Other values (116) 1293803
31.6%
Common
ValueCountFrequency (%)
144990
79.6%
- 34622
 
19.0%
' 1289
 
0.7%
. 405
 
0.2%
! 336
 
0.2%
/ 270
 
0.1%
, 109
 
0.1%
` 47
 
< 0.1%
[ 3
 
< 0.1%
] 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4231633
99.0%
None 44134
 
1.0%
Latin Ext Additional 301
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 626722
14.8%
i 390317
 
9.2%
n 330354
 
7.8%
r 306970
 
7.3%
o 292322
 
6.9%
e 218622
 
5.2%
s 178051
 
4.2%
l 167995
 
4.0%
t 152746
 
3.6%
144990
 
3.4%
Other values (52) 1422544
33.6%
None
ValueCountFrequency (%)
í 11066
25.1%
á 11012
25.0%
é 7746
17.6%
ó 5432
12.3%
ã 2198
 
5.0%
Î 1381
 
3.1%
ô 965
 
2.2%
ü 753
 
1.7%
ñ 686
 
1.6%
â 598
 
1.4%
Other values (49) 2297
 
5.2%
Latin Ext Additional
ValueCountFrequency (%)
76
25.2%
41
13.6%
37
12.3%
27
 
9.0%
27
 
9.0%
23
 
7.6%
20
 
6.6%
17
 
5.6%
11
 
3.7%
7
 
2.3%
Other values (5) 15
 
5.0%

level2Gid
Text

Missing 

Distinct14207
Distinct (%)3.3%
Missing1927752
Missing (%)81.6%
Memory size18.0 MiB

Length

Max length12
Median length11
Mean length10.17988753
Min length7

Characters and Unicode

Total characters4415231
Distinct characters38
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3239
Unique (%)0.7%

Sample

1st rowUSA.49.42_1
2nd rowUSA.20.10_1
3rd rowUSA.32.8_1
4th rowUSA.38.35_1
5th rowCRI.2.2_1
Value                  Count    Frequency (%)
usa.9.1_1              5812     1.3%
usa.21.15_1            4120     0.9%
usa.21.16_1            4057     0.9%
guy.8.8_1              3799     0.9%
usa.34.87_1            2754     0.6%
guy.2.8_1              2722     0.6%
guy.10.4_1             2607     0.6%
usa.47.40_1            2604     0.6%
usa.10.43_1            2474     0.6%
usa.47.50_1            2230     0.5%
Other values (14197)   400542   92.4%

Most occurring characters

ValueCountFrequency (%)
. 865426
19.6%
1 701622
15.9%
_ 433720
9.8%
A 257502
 
5.8%
2 256721
 
5.8%
U 241816
 
5.5%
S 207671
 
4.7%
4 180297
 
4.1%
3 160266
 
3.6%
5 113447
 
2.6%
Other values (28) 996743
22.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1815689
41.1%
Uppercase Letter 1300396
29.5%
Other Punctuation 865426
19.6%
Connector Punctuation 433720
 
9.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 257502
19.8%
U 241816
18.6%
S 207671
16.0%
E 69062
 
5.3%
N 66514
 
5.1%
R 52484
 
4.0%
C 45749
 
3.5%
G 42495
 
3.3%
M 38963
 
3.0%
B 35575
 
2.7%
Other values (16) 242565
18.7%
Decimal Number
ValueCountFrequency (%)
1 701622
38.6%
2 256721
 
14.1%
4 180297
 
9.9%
3 160266
 
8.8%
5 113447
 
6.2%
7 92449
 
5.1%
6 87379
 
4.8%
8 82143
 
4.5%
9 73044
 
4.0%
0 68321
 
3.8%
Other Punctuation
ValueCountFrequency (%)
. 865426
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 433720
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3114835
70.5%
Latin 1300396
29.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 257502
19.8%
U 241816
18.6%
S 207671
16.0%
E 69062
 
5.3%
N 66514
 
5.1%
R 52484
 
4.0%
C 45749
 
3.5%
G 42495
 
3.3%
M 38963
 
3.0%
B 35575
 
2.7%
Other values (16) 242565
18.7%
Common
ValueCountFrequency (%)
. 865426
27.8%
1 701622
22.5%
_ 433720
13.9%
2 256721
 
8.2%
4 180297
 
5.8%
3 160266
 
5.1%
5 113447
 
3.6%
7 92449
 
3.0%
6 87379
 
2.8%
8 82143
 
2.6%
Other values (2) 141365
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4415231
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 865426
19.6%
1 701622
15.9%
_ 433720
9.8%
A 257502
 
5.8%
2 256721
 
5.8%
U 241816
 
5.5%
S 207671
 
4.7%
4 180297
 
4.1%
3 160266
 
3.6%
5 113447
 
2.6%
Other values (28) 996743
22.6%

level2Name
Text

Missing 

Distinct12304
Distinct (%)2.8%
Missing1927850
Missing (%)81.6%
Memory size18.0 MiB

Length

Max length32
Median length28
Mean length9.255454623
Min length1

Characters and Unicode

Total characters4013378
Distinct characters171
Distinct categories10
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2893
Unique (%)0.7%

Sample

1st rowRandolph
2nd rowPenobscot
3rd rowDona Ana
4th rowWheeler
5th rowCartago
ValueCountFrequency (%)
of 16654
 
2.7%
rest 9994
 
1.6%
region 9986
 
1.6%
san 8897
 
1.4%
de 7619
 
1.2%
columbia 6006
 
1.0%
district 5938
 
1.0%
prince 5814
 
0.9%
montgomery 5411
 
0.9%
4736
 
0.8%
Other values (12309) 538341
86.9%

Most occurring characters

ValueCountFrequency (%)
a 472400
 
11.8%
o 308144
 
7.7%
e 303215
 
7.6%
n 287418
 
7.2%
i 252180
 
6.3%
r 237397
 
5.9%
185773
 
4.6%
t 161195
 
4.0%
l 160373
 
4.0%
s 140991
 
3.5%
Other values (161) 1504292
37.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3194580
79.6%
Uppercase Letter 582861
 
14.5%
Space Separator 185773
 
4.6%
Dash Punctuation 15579
 
0.4%
Decimal Number 14276
 
0.4%
Other Punctuation 14257
 
0.4%
Open Punctuation 3087
 
0.1%
Close Punctuation 1816
 
< 0.1%
Math Symbol 1131
 
< 0.1%
Modifier Symbol 18
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 472400
14.8%
o 308144
9.6%
e 303215
9.5%
n 287418
 
9.0%
i 252180
 
7.9%
r 237397
 
7.4%
t 161195
 
5.0%
l 160373
 
5.0%
s 140991
 
4.4%
u 135057
 
4.2%
Other values (91) 736210
23.0%
Uppercase Letter
ValueCountFrequency (%)
C 66713
 
11.4%
S 55503
 
9.5%
M 50752
 
8.7%
R 40959
 
7.0%
P 36962
 
6.3%
B 35720
 
6.1%
A 34671
 
5.9%
L 26794
 
4.6%
G 25252
 
4.3%
D 24677
 
4.2%
Other values (37) 184858
31.7%
Decimal Number
ValueCountFrequency (%)
8 3930
27.5%
7 2826
19.8%
9 2732
19.1%
1 2623
18.4%
0 980
 
6.9%
2 317
 
2.2%
3 305
 
2.1%
6 265
 
1.9%
5 228
 
1.6%
4 70
 
0.5%
Other Punctuation
ValueCountFrequency (%)
' 6045
42.4%
. 4259
29.9%
/ 2092
 
14.7%
, 1702
 
11.9%
& 91
 
0.6%
? 62
 
0.4%
# 6
 
< 0.1%
Space Separator
ValueCountFrequency (%)
185773
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 15579
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3087
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1816
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1131
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3777441
94.1%
Common 235937
 
5.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 472400
 
12.5%
o 308144
 
8.2%
e 303215
 
8.0%
n 287418
 
7.6%
i 252180
 
6.7%
r 237397
 
6.3%
t 161195
 
4.3%
l 160373
 
4.2%
s 140991
 
3.7%
u 135057
 
3.6%
Other values (138) 1319071
34.9%
Common
ValueCountFrequency (%)
185773
78.7%
- 15579
 
6.6%
' 6045
 
2.6%
. 4259
 
1.8%
8 3930
 
1.7%
( 3087
 
1.3%
7 2826
 
1.2%
9 2732
 
1.2%
1 2623
 
1.1%
/ 2092
 
0.9%
Other values (13) 6991
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3963042
98.7%
None 50010
 
1.2%
Latin Ext Additional 326
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 472400
 
11.9%
o 308144
 
7.8%
e 303215
 
7.7%
n 287418
 
7.3%
i 252180
 
6.4%
r 237397
 
6.0%
185773
 
4.7%
t 161195
 
4.1%
l 160373
 
4.0%
s 140991
 
3.6%
Other values (65) 1453956
36.7%
None
ValueCountFrequency (%)
ó 9474
18.9%
í 9371
18.7%
á 9248
18.5%
é 9176
18.3%
ã 2578
 
5.2%
ñ 2379
 
4.8%
ú 1522
 
3.0%
ê 1021
 
2.0%
ü 956
 
1.9%
ç 746
 
1.5%
Other values (63) 3539
 
7.1%
Latin Ext Additional
ValueCountFrequency (%)
67
20.6%
67
20.6%
ế 33
10.1%
29
8.9%
15
 
4.6%
15
 
4.6%
15
 
4.6%
14
 
4.3%
13
 
4.0%
12
 
3.7%
Other values (13) 46
14.1%

level3Gid
Text

Missing 

Distinct8201
Distinct (%)8.0%
Missing2259567
Missing (%)95.7%
Memory size18.0 MiB

Length

Max length36
Median length22
Mean length11.78068023
Min length11

Characters and Unicode

Total characters1200522
Distinct characters48
Distinct categories7
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
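
As a rough illustration of how per-category character counts like the ones below can be derived, the following sketch tallies Unicode general categories using Python's standard `unicodedata` module. The `CATEGORY_NAMES` mapping and the `category_counts` helper are hypothetical names introduced here, not the profiler's own code, and the three sample strings are taken from the Sample rows of this variable. Script and block properties are not exposed by `unicodedata`, so they are left out of the sketch.

```python
import unicodedata
from collections import Counter

# Illustrative mapping from two-letter general-category codes to the long
# names used in this report; an assumption, not the profiler's own table.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Pc": "Connector Punctuation",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not in the mapping.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample = ["CRI.2.2.4_1", "IND.19.16.3_1", "CHN.30.7.7_1"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

On these three values the digits dominate, followed by equal counts of uppercase letters and full stops, which matches the ordering of the category table for this variable.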

Unique

Unique2819
Unique (%)2.8%

Sample

1st rowCRI.2.2.4_1
2nd rowIND.19.16.3_1
3rd rowCHN.30.7.7_1
4th rowCRI.7.10.3_1
5th rowRUS.61.13.1_1
ValueCountFrequency (%)
can.13.1.35_1 1996
 
2.0%
per.18.3.4_1 1086
 
1.1%
per.8.9.1_1 918
 
0.9%
per.1.4.3_1 869
 
0.9%
pan.4.2.4_1 817
 
0.8%
pan.4.2.6_1 809
 
0.8%
mdg.2.1.5_1 704
 
0.7%
cri.5.2.1_1 568
 
0.6%
mdg.6.2.3_1 521
 
0.5%
per.18.1.1_1 500
 
0.5%
Other values (8193) 93120
91.4%

Most occurring characters

ValueCountFrequency (%)
. 305698
25.5%
1 204681
17.0%
_ 101899
 
8.5%
2 69213
 
5.8%
3 45164
 
3.8%
4 42456
 
3.5%
C 35435
 
3.0%
E 30901
 
2.6%
5 30366
 
2.5%
A 29431
 
2.5%
Other values (38) 305278
25.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 487863
40.6%
Other Punctuation 305698
25.5%
Uppercase Letter 304935
25.4%
Connector Punctuation 101899
 
8.5%
Lowercase Letter 101
 
< 0.1%
Dash Punctuation 24
 
< 0.1%
Space Separator 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 35435
11.6%
E 30901
 
10.1%
A 29431
 
9.7%
N 29194
 
9.6%
R 22473
 
7.4%
P 21127
 
6.9%
H 15472
 
5.1%
U 15163
 
5.0%
L 14604
 
4.8%
I 13610
 
4.5%
Other values (13) 77525
25.4%
Lowercase Letter
ValueCountFrequency (%)
a 28
27.7%
c 25
24.8%
b 18
17.8%
d 12
11.9%
e 9
 
8.9%
r 2
 
2.0%
i 2
 
2.0%
l 2
 
2.0%
s 1
 
1.0%
m 1
 
1.0%
Decimal Number
ValueCountFrequency (%)
1 204681
42.0%
2 69213
 
14.2%
3 45164
 
9.3%
4 42456
 
8.7%
5 30366
 
6.2%
6 25988
 
5.3%
8 22572
 
4.6%
9 17190
 
3.5%
7 16841
 
3.5%
0 13392
 
2.7%
Other Punctuation
ValueCountFrequency (%)
. 305698
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 101899
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 24
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 895486
74.6%
Latin 305036
 
25.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 35435
11.6%
E 30901
 
10.1%
A 29431
 
9.6%
N 29194
 
9.6%
R 22473
 
7.4%
P 21127
 
6.9%
H 15472
 
5.1%
U 15163
 
5.0%
L 14604
 
4.8%
I 13610
 
4.5%
Other values (24) 77626
25.4%
Common
ValueCountFrequency (%)
. 305698
34.1%
1 204681
22.9%
_ 101899
 
11.4%
2 69213
 
7.7%
3 45164
 
5.0%
4 42456
 
4.7%
5 30366
 
3.4%
6 25988
 
2.9%
8 22572
 
2.5%
9 17190
 
1.9%
Other values (4) 30259
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1200522
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 305698
25.5%
1 204681
17.0%
_ 101899
 
8.5%
2 69213
 
5.8%
3 45164
 
3.8%
4 42456
 
3.5%
C 35435
 
3.0%
E 30901
 
2.6%
5 30366
 
2.5%
A 29431
 
2.5%
Other values (38) 305278
25.4%

level3Name
Text

Missing 

Distinct7685
Distinct (%)7.6%
Missing2260777
Missing (%)95.7%
Memory size18.0 MiB

Length

Max length32
Median length28
Mean length10.13161397
Min length2

Characters and Unicode

Total characters1020213
Distinct characters138
Distinct categories9
Distinct scripts2
Distinct blocks4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2589
Unique (%)2.6%

Sample

1st rowDulce Nombre
2nd rowKukshi
3rd rowLunan
4th rowSan Pedro
5th rowBan Luang
ValueCountFrequency (%)
unorganized 3367
 
2.2%
san 3200
 
2.1%
de 3074
 
2.0%
yukon 1996
 
1.3%
el 1944
 
1.2%
santa 1489
 
1.0%
la 1389
 
0.9%
rio 1264
 
0.8%
no 1168
 
0.8%
tambopata 1086
 
0.7%
Other values (7998) 135629
87.2%

Most occurring characters

ValueCountFrequency (%)
a 141798
 
13.9%
o 73803
 
7.2%
n 72305
 
7.1%
i 64581
 
6.3%
e 60134
 
5.9%
54910
 
5.4%
r 52655
 
5.2%
u 39382
 
3.9%
l 35764
 
3.5%
t 33716
 
3.3%
Other values (128) 391165
38.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 787258
77.2%
Uppercase Letter 151480
 
14.8%
Space Separator 54910
 
5.4%
Other Punctuation 10121
 
1.0%
Decimal Number 6298
 
0.6%
Open Punctuation 4014
 
0.4%
Close Punctuation 3325
 
0.3%
Dash Punctuation 2796
 
0.3%
Final Punctuation 11
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 141798
18.0%
o 73803
 
9.4%
n 72305
 
9.2%
i 64581
 
8.2%
e 60134
 
7.6%
r 52655
 
6.7%
u 39382
 
5.0%
l 35764
 
4.5%
t 33716
 
4.3%
s 25651
 
3.3%
Other values (72) 187469
23.8%
Uppercase Letter
ValueCountFrequency (%)
S 15511
 
10.2%
C 14824
 
9.8%
B 10184
 
6.7%
T 9920
 
6.5%
P 9627
 
6.4%
M 9584
 
6.3%
A 8905
 
5.9%
L 7661
 
5.1%
N 6992
 
4.6%
K 6215
 
4.1%
Other values (24) 52057
34.4%
Decimal Number
ValueCountFrequency (%)
1 1989
31.6%
2 877
13.9%
3 556
 
8.8%
4 506
 
8.0%
9 485
 
7.7%
5 463
 
7.4%
0 435
 
6.9%
6 404
 
6.4%
8 295
 
4.7%
7 288
 
4.6%
Other Punctuation
ValueCountFrequency (%)
. 4784
47.3%
, 4425
43.7%
/ 358
 
3.5%
' 346
 
3.4%
! 191
 
1.9%
: 11
 
0.1%
" 6
 
0.1%
Space Separator
ValueCountFrequency (%)
54910
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4014
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3325
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2796
100.0%
Final Punctuation
ValueCountFrequency (%)
11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 938738
92.0%
Common 81475
 
8.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 141798
15.1%
o 73803
 
7.9%
n 72305
 
7.7%
i 64581
 
6.9%
e 60134
 
6.4%
r 52655
 
5.6%
u 39382
 
4.2%
l 35764
 
3.8%
t 33716
 
3.6%
s 25651
 
2.7%
Other values (106) 338949
36.1%
Common
ValueCountFrequency (%)
54910
67.4%
. 4784
 
5.9%
, 4425
 
5.4%
( 4014
 
4.9%
) 3325
 
4.1%
- 2796
 
3.4%
1 1989
 
2.4%
2 877
 
1.1%
3 556
 
0.7%
4 506
 
0.6%
Other values (12) 3293
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1010607
99.1%
None 9254
 
0.9%
Latin Ext Additional 341
 
< 0.1%
Punctuation 11
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 141798
 
14.0%
o 73803
 
7.3%
n 72305
 
7.2%
i 64581
 
6.4%
e 60134
 
6.0%
54910
 
5.4%
r 52655
 
5.2%
u 39382
 
3.9%
l 35764
 
3.5%
t 33716
 
3.3%
Other values (63) 381559
37.8%
None
ValueCountFrequency (%)
á 1795
19.4%
é 1690
18.3%
ó 1511
16.3%
ñ 1432
15.5%
í 927
10.0%
ê 371
 
4.0%
è 288
 
3.1%
ü 183
 
2.0%
à 156
 
1.7%
â 120
 
1.3%
Other values (31) 781
8.4%
Latin Ext Additional
ValueCountFrequency (%)
58
17.0%
ế 40
11.7%
30
 
8.8%
24
 
7.0%
22
 
6.5%
21
 
6.2%
21
 
6.2%
17
 
5.0%
16
 
4.7%
16
 
4.7%
Other values (13) 76
22.3%
Punctuation
ValueCountFrequency (%)
11
100.0%

iucnRedListCategory
Text

Missing 

Distinct15
Distinct (%)< 0.1%
Missing383090
Missing (%)16.2%
Memory size18.0 MiB

Length

Max length24
Median length2
Mean length2.000066721
Min length2

Characters and Unicode

Total characters3956898
Distinct characters25
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6
Unique (%)< 0.1%

Sample

1st rowNE
2nd rowNE
3rd rowLC
4th rowNE
5th rowNE
ValueCountFrequency (%)
ne 1310250
66.2%
lc 593364
30.0%
vu 24743
 
1.3%
nt 19503
 
1.0%
en 12442
 
0.6%
dd 10871
 
0.5%
cr 6368
 
0.3%
ex 663
 
< 0.1%
ew 173
 
< 0.1%
2024-12-02t13:57:00.684z 1
 
< 0.1%
Other values (5) 5
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
N 1342195
33.9%
E 1323528
33.4%
C 599732
15.2%
L 593364
15.0%
V 24743
 
0.6%
U 24743
 
0.6%
D 21742
 
0.5%
T 19509
 
0.5%
R 6368
 
0.2%
X 663
 
< 0.1%
Other values (15) 311
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3956766
> 99.9%
Decimal Number 102
 
< 0.1%
Other Punctuation 18
 
< 0.1%
Dash Punctuation 12
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 1342195
33.9%
E 1323528
33.4%
C 599732
15.2%
L 593364
15.0%
V 24743
 
0.6%
U 24743
 
0.6%
D 21742
 
0.5%
T 19509
 
0.5%
R 6368
 
0.2%
X 663
 
< 0.1%
Other values (2) 179
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 28
27.5%
0 16
15.7%
1 14
13.7%
5 11
 
10.8%
3 10
 
9.8%
4 8
 
7.8%
8 5
 
4.9%
6 4
 
3.9%
7 3
 
2.9%
9 3
 
2.9%
Other Punctuation
ValueCountFrequency (%)
: 12
66.7%
. 6
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3956766
> 99.9%
Common 132
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 28
21.2%
0 16
12.1%
1 14
10.6%
- 12
9.1%
: 12
9.1%
5 11
 
8.3%
3 10
 
7.6%
4 8
 
6.1%
. 6
 
4.5%
8 5
 
3.8%
Other values (3) 10
 
7.6%
Latin
ValueCountFrequency (%)
N 1342195
33.9%
E 1323528
33.4%
C 599732
15.2%
L 593364
15.0%
V 24743
 
0.6%
U 24743
 
0.6%
D 21742
 
0.5%
T 19509
 
0.5%
R 6368
 
0.2%
X 663
 
< 0.1%
Other values (2) 179
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3956898
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 1342195
33.9%
E 1323528
33.4%
C 599732
15.2%
L 593364
15.0%
V 24743
 
0.6%
U 24743
 
0.6%
D 21742
 
0.5%
T 19509
 
0.5%
R 6368
 
0.2%
X 663
 
< 0.1%
Other values (15) 311
 
< 0.1%